Code2Video - Show Lab's open-source AI framework for generating teaching videos
Code2Video is an innovative open-source project that automatically converts code snippets into high-quality video content (MP4 format). Following a code-centric paradigm, the project uses the carbon-now-cli tool to render code as polished images, and ffmpeg is then applied to these ...
SceneGen - Shanghai Jiao Tong University's open-source framework for generating 3D scenes from a single image
SceneGen is an open-source method from Shanghai Jiao Tong University for generating 3D scenes from a single image. Given a single scene image and the target asset masks, it efficiently generates a complete scene containing multiple 3D assets, including their geometry, texture, and relative spatial positions.
Ming-UniAudio - Ant Group's open-source unified audio multimodal generation model
Ming-UniAudio is Ant Group's open-source unified audio multimodal generation model, supporting mixed input and output of text, audio, image, and video. It uses a multi-scale Transformer and a Mixture-of-Experts (MoE) architecture, with a modality-aware routing mechanism to efficiently handle cross-modal ...
AIMangaStudio - Free AI manga creation tool with a complete creation workflow
AIMangaStudio is a free AI manga creation tool that gives creators a complete manga production pipeline, including plot generation, storyboard design, and character design, simplifying the process from script to finished manga page. It supports generating comic scripts from natural language, including plot, dialog...
FireRedChat - Xiaohongshu's open-source full-duplex voice interaction system
FireRedChat is an open-source full-duplex voice interaction system from Xiaohongshu, with real-time bidirectional dialogue and support for controllable interruption. It has a modular design comprising a transcription control module, an interaction module, a dialogue manager, and other components. It supports both cascade and semi-cascade architectures and can be deployed flexibly.
Logics-Parsing - Alibaba's open-source document parsing model
Logics-Parsing is Alibaba's open-source end-to-end document parsing model, based on Qwen2.5-VL-7B. It uses reinforcement learning to optimize document layout analysis and reading-order inference, and converts PDF images into structured HTML output, supporting a variety of content ...
Ring-1T-preview - Ant Group's open-source trillion-parameter large model
Ring-1T-preview is Ant Group's open-source trillion-parameter large model. It is based on the Ling 2.0 MoE architecture, pre-trained on a 20T corpus, and trained for reasoning with ASystem, Ant Group's self-developed reinforcement learning system. In natural language reasoning ...
RoboBrain-X0 - BAAI's open-source embodied model with zero-shot cross-embodiment generalization
RoboBrain-X0 is the world's first open-source embodied model to support zero-shot cross-embodiment generalization, released by the Beijing Academy of Artificial Intelligence (BAAI), and is of considerable industrial significance. It can drive multiple real robots of different configurations to complete basic manipulation tasks without fine-tuning, and after few-shot fine-tuning, it demonstrates the ability to replicate ...
Lynx - ByteDance's open-source high-fidelity video generation model
Lynx is a high-fidelity personalized video generation model open-sourced by ByteDance that can generate identity-consistent videos from a single portrait photo. Built on a Diffusion Transformer (DiT) base model, it introduces an ID-adapter and Ref-adapte...
DeepSeek-V3.2-Exp - DeepSeek's latest open-source experimental AI model
DeepSeek-V3.2-Exp is an open-source experimental AI model from DeepSeek that significantly improves the efficiency of long-text processing by introducing the DeepSeek Sparse Attention (DSA) mechanism. The model is based on DeepSeek...









