Lucy Edit - Open Source AI Video Editing Tool Driven by Natural Language Descriptions
Lucy Edit is an open source AI video editing tool developed by Decart AI. It lets users edit video through simple natural language descriptions, such as "change the character into a polar bear" or "turn the scene into a 2D cartoon style", without complex fine-tuning or masks ...
LongCat-Flash-Thinking - An Efficient Reasoning Model Open-Sourced by Meituan
LongCat-Flash-Thinking is a highly efficient reasoning model released by Meituan's LongCat team. It is more powerful and specialized than LongCat-Flash-Chat while keeping that model's extreme speed. The model focuses on logic, math, code, agents...
Kronos - Financial K-Line Chart Foundation Model Jointly Open-Sourced by Tsinghua and Microsoft
Kronos is the first K-line (candlestick) chart foundation model for financial markets, jointly open-sourced by Tsinghua University and Microsoft Research Asia. It analyzes K-line data for stocks, cryptocurrencies, and other assets, including the opening, high, low, and closing prices and the trading volume, to predict future price movements.
Wan2.2-Animate - An Open Source Motion Generation Model from Tongyi Wanxiang
Wan2.2-Animate is an open source motion generation model that supports motion imitation and role-play modes. Given only a character image and a reference video, the model transfers the movements and expressions of the character in the video onto the character in the image, giving the still character dynamic expression ...
InternVLA-A1 - Shanghai AI Lab's Open Source Integrated Embodied Manipulation Large Model
InternVLA-A1 is an embodied manipulation large model open-sourced by Shanghai Artificial Intelligence Laboratory. It integrates understanding, imagination, and execution, and can complete tasks accurately. The model fuses real and simulated manipulation data, and uses large-scale hybrid virtual-real scene assets to automatically construct massive multimodal...
VoxCPM - End-to-End TTS Model Open-Sourced by ModelBest and Tsinghua
VoxCPM is a speech generation model jointly open-sourced by ModelBest and the Shenzhen International Graduate School of Tsinghua University. VoxCPM adopts an end-to-end diffusion autoregressive architecture that generates continuous speech representations directly from text, overcoming the limitations of traditional discrete tokenization. Through hierarchical language modeling and finite scalar quantization...
InternVLA-N1 - Shanghai AI Lab's Open Source End-to-End Dual-System Navigation Large Model
InternVLA-N1 is an open source end-to-end dual-system navigation large model from Shanghai Artificial Intelligence Laboratory. In its dual-system architecture, System 2 is responsible for understanding language instructions and planning long-range paths, while System 1 focuses on high-frequency response and agile obstacle avoidance. The model is trained entirely on synthetic data through large-scale digital ...
VLAC - Shanghai AI Lab's Open Source Embodied Reward Large Model
VLAC is an open source embodied reward large model from Shanghai Artificial Intelligence Laboratory. Built on the InternVL multimodal large model, it integrates Internet video data and robot manipulation data to provide process rewards and task completion estimates for robot reinforcement learning in the real world. VLAC can effectively ...
InternVLA-M1 - Shanghai AI Lab's Open Source Embodied Dual-System Manipulation "Brain"
InternVLA-M1 is an open source embodied manipulation "brain" from Shanghai Artificial Intelligence Laboratory: a dual-system manipulation large model oriented toward instruction following. It builds a complete closed loop covering "think-act-learn" and is responsible for high-level spatial reasoning and task planning. The model adopts a two-phase training cur...
PromptEnhancer - Open Source AI Prompt Enhancement Tool from Tencent Hunyuan
PromptEnhancer is an open source prompt enhancement tool from Tencent's Hunyuan team that improves the output of text-to-image (T2I) models. Through the chain-of-thought (CoT) approach to the use of ...









