AI Sharing Circle

Daily sharing of the latest AI products, projects, frameworks, and paper interpretations.
Genie 3 - A General-Purpose World Model from Google

Genie 3 is a next-generation general-purpose world model from Google DeepMind that generates highly dynamic, coherent virtual worlds in real time. Genie 3 simulates physical phenomena and natural ecosystems, and supports the creation of fantasy and historical scenarios. With text prompts, users can...
3 days ago
Claude Opus 4.1 - The Most Powerful Programming Model from Anthropic

Claude Opus 4.1 is a state-of-the-art large language model from Anthropic, designed to handle complex tasks efficiently. The model excels at programming: it generates high-quality code, supports up to 32K tokens in a single output, and adapts to a wide range of programming styles...
3 days ago
gpt-oss - A Family of Open Source Reasoning Models from OpenAI

gpt-oss is a family of open source reasoning models from OpenAI that gives developers efficient, flexible, and easy-to-deploy AI solutions. gpt-oss comes in two versions: gpt-oss-120B, with 117 billion parameters and support for 8...
3 days ago
MiDashengLM - Xiaomi's Open Source Sound Understanding Model

MiDashengLM is Xiaomi's open source large model for efficient sound understanding, released as MiDashengLM-7B and focused on audio processing and understanding. The model is based on the Xiaomi Dasheng audio encoder and Qwen2.5-Omn...
4 days ago
MOSS-TTSD - A Tsinghua Lab's Open Source Bilingual Dialogue Speech Generation Model

MOSS-TTSD is an open source spoken-dialogue speech generation model developed by the Speech and Language Laboratory of Tsinghua University. MOSS-TTSD converts text dialogue scripts into natural, smooth, and expressive conversational speech, and supports bilingual generation in English and Chinese.
4 days ago
AudioGen-Omni - Multimodal Audio Generation Model from Kuaishou

AudioGen-Omni is a multimodal audio generation model from Kuaishou that generates high-quality audio, speech, and songs from inputs such as video and text. AudioGen-Omni is based on advanced techniques such as a multimodal diffusion Transformer and phase-aligned...
4 days ago
RedOne - The Latest Social Large Model from Xiaohongshu

RedOne is a large language model customized for social networks, introduced by Xiaohongshu. The model is trained with a three-stage strategy that incorporates social and cultural knowledge, strengthens multitasking capabilities, and aligns with human preferences. RedOne significantly outperforms the base model on social tasks, in harmful content detection and browsing...
5 days ago
FastDeploy - Baidu's High-Performance Large Model Inference and Deployment Tool

FastDeploy is a high-performance inference and deployment tool from Baidu, designed for Large Language Models (LLMs) and Vision-Language Models (VLMs). FastDeploy is built on the PaddlePaddle framework and supports a variety of hardware platforms...
5 days ago
InteriorGS - 3D Gaussian Semantic Dataset from Qunhe Technology (Manycore Tech)

InteriorGS is a high-quality 3D Gaussian semantic dataset introduced by Qunhe Technology (Manycore Tech). The dataset contains 1,000 3D scenes covering more than 80 indoor environments such as homes, convenience stores, wedding halls, and museums. The dataset has more than 554,000 object instances in 755 categories...
5 days ago
DragonV2.1 - Zero-Shot Speech Synthesis Model from Microsoft

DragonV2.1 is an advanced zero-shot text-to-speech (TTS) model from Microsoft. Based on the Transformer architecture, the model supports multiple languages and zero-shot voice cloning, and generates natural, expressive speech from only 5-90 seconds of voice prompts.
5 days ago