Matrix-3D - Kunlun Tech's open source 3D world generation framework
Matrix-3D is an open source framework from Kunlun Tech's Skywork AI team that focuses on generating explorable panoramic 3D worlds. The framework combines panoramic video generation with 3D reconstruction to produce high-quality, omnidirectional, explorable 3D worlds from a single image or text prompt...
GLM-4.5V - Open Source Multimodal Visual Reasoning Model from Zhipu AI
GLM-4.5V is an open source visual reasoning model from Zhipu AI, with 106 billion total parameters and 12 billion active parameters. The model is built on the new-generation text base model GLM-4.5-Air and offers strong visual understanding and reasoning capabilities, handling images, video...
Genie 3 - A General-Purpose World Model from Google
Genie 3 is a next-generation general-purpose world model from Google DeepMind that generates highly dynamic, coherent virtual worlds in real time. Genie 3 simulates physical phenomena and natural ecosystems, and supports the creation of fantasy and historical scenarios. With text prompts, users can...
Claude Opus 4.1 - The Most Powerful Programming Model from Anthropic
Claude Opus 4.1 is a large language model from Anthropic, designed for efficient processing of complex tasks. The model excels at programming: it generates high-quality code, supports up to 32K tokens of output in a single response, and adapts to a wide range of programming styles...
gpt-oss - a family of open source reasoning models from OpenAI
gpt-oss is a family of open source reasoning models from OpenAI that give developers efficient, flexible, and easy-to-deploy AI solutions. gpt-oss comes in two versions, gpt-oss-120B with 117 billion parameters and support for 8...
MiDashengLM - Xiaomi's open source sound understanding model
MiDashengLM is Xiaomi's open source large model for efficient sound understanding, released as MiDashengLM-7B and focused on audio processing and understanding. The model is based on the Xiaomi Dasheng audio encoder and Qwen2.5-Omn...
MOSS-TTSD - Tsinghua Lab's open source speech generation model for bilingual dialogs
MOSS-TTSD is an open source spoken dialog speech generation model developed by the Speech and Language Laboratory of Tsinghua University. MOSS-TTSD can convert text dialog scripts into natural, smooth and expressive conversational speech, and supports bilingual generation in English and Chinese.
AudioGen-Omni - Multimodal Audio Generation Model from Kuaishou
AudioGen-Omni is a multimodal audio generation model from Kuaishou that generates high-quality audio, speech, and songs from inputs such as video and text. AudioGen-Omni is based on advanced techniques such as a multimodal diffusion Transformer and phase-aligned...
RedOne - the latest large language model for social networking from Xiaohongshu
RedOne is a large language model customized for social networks, introduced by Xiaohongshu (Little Red Book). The model is trained with a three-stage strategy that incorporates social and cultural knowledge, strengthens multitasking capabilities, and aligns with human preferences. RedOne significantly outperforms the base model on social tasks, in harmful content detection and browsing...
FastDeploy - Baidu's high-performance inference and deployment tool for large models
FastDeploy is a high-performance inference and deployment tool from Baidu, designed for Large Language Models (LLMs) and Vision-Language Models (VLMs). FastDeploy is built on the PaddlePaddle framework and supports a variety of hardware platforms...