Genie 3 - A Universal World Model from Google
Genie 3 is a next-generation universal world model from Google DeepMind that generates highly dynamic, coherent virtual worlds in real time. Genie 3 simulates physical phenomena and natural ecosystems, and supports the creation of fantasy and historical scenarios. With text prompts, users can...
Claude Opus 4.1 - The Most Powerful Programming Model from Anthropic
Claude Opus 4.1 is a state-of-the-art large language model from Anthropic, designed to handle complex tasks efficiently. The model excels at programming, generating high-quality code, supporting up to 32K tokens of output in a single response, and adapting to a wide range of programming styles...
gpt-oss - a family of open-source reasoning models from OpenAI
gpt-oss is a family of open-source reasoning models from OpenAI that give developers efficient, flexible, and easy-to-deploy AI solutions. gpt-oss comes in two versions, gpt-oss-120B with 117 billion parameters and support for 8...
MiDashengLM - Xiaomi's open source sound understanding model
MiDashengLM is Xiaomi's open source large model for efficient sound understanding, released as MiDashengLM-7B and focused on audio processing and understanding. The model is based on the Xiaomi Dasheng audio encoder and Qwen2.5-Omn...
MOSS-TTSD - Tsinghua Lab's open source speech generation model for bilingual dialogs
MOSS-TTSD is an open source spoken dialog speech generation model developed by the Speech and Language Laboratory of Tsinghua University. MOSS-TTSD can convert text dialog scripts into natural, smooth and expressive conversational speech, and supports bilingual generation in English and Chinese.
AudioGen-Omni - Multimodal Audio Generation Model from Kuaishou
AudioGen-Omni is a multimodal audio generation model from Kuaishou that generates high-quality audio, speech, and songs from inputs such as video and text. AudioGen-Omni is based on advanced techniques such as a multimodal diffusion Transformer and phase-aligned...
RedOne - the latest social large model from Xiaohongshu (Little Red Book)
RedOne is a large language model customized for social networks, introduced by Xiaohongshu (Little Red Book). The model is trained with a three-stage strategy that incorporates social and cultural knowledge, strengthens multitasking capabilities, and aligns with human preferences. RedOne significantly outperforms the base model in social task performance, in harmful content detection and browsing...
FastDeploy - Baidu's high-performance large model inference and deployment tool
FastDeploy is a high-performance inference and deployment tool from Baidu, designed for Large Language Models (LLMs) and Visual Language Models (VLMs). FastDeploy is built on the PaddlePaddle framework and supports a variety of hardware platforms...
InteriorGS - 3D Gaussian Semantic Dataset launched by Manycore Tech
InteriorGS is a high-quality 3D Gaussian semantic dataset introduced by Manycore Tech. The dataset contains 1,000 3D scenes covering more than 80 indoor environments such as homes, convenience stores, wedding halls and museums. The dataset has more than 554,000 object instances in 755 categories...
DragonV2.1 - Zero-Shot Speech Synthesis Model from Microsoft
DragonV2.1 is an advanced zero-shot text-to-speech (TTS) model from Microsoft. Based on the Transformer architecture, the model supports multiple languages and zero-shot voice cloning, and generates natural, expressive speech from a voice prompt of only 5 to 90 seconds.