AI Sharing Circle

Daily updates on the latest AI products, projects, frameworks, and paper walkthroughs.
Skywork UniPic 2.0 - An efficient open-source multimodal model from Kunlun Wanwei

Skywork UniPic 2.0 is an efficient multimodal model open-sourced by Kunlun Wanwei, focused on image generation, editing, and understanding. The model builds on the 2B-parameter SD3.5-Medium architecture and is trained through pre-training, a progressive dual-task reinforcement strategy, and co-training...
8mos ago
RynnRCP - The first open-source Robotics Context Protocol, from Alibaba DAMO Academy

RynnRCP is an open-source Robotics Context Protocol (RCP) from Alibaba DAMO Academy that lowers the barrier to developing embodied intelligence and opens up the entire development process. RynnRCP consists of the RCP framework and the RobotMotion module. The RCP framework, through capability abstraction and multi-protocol support, will...
8mos ago
RynnEC - An open-source world understanding model from Alibaba DAMO Academy

RynnEC is a world understanding model from Alibaba DAMO Academy, focused on embodied intelligence tasks. The model uses multimodal fusion to combine video data with natural language, and can parse the objects in a scene along multiple dimensions, supporting object understanding, spatial perception, and video object segmentation.
8mos ago
Matrix-3D - An open-source 3D world generation framework from Kunlun Wanwei

Matrix-3D is an open-source framework from the Skywork AI team, focused on generating explorable panoramic 3D worlds. The framework combines panoramic video generation with 3D reconstruction to produce high-quality, omnidirectional, explorable 3D worlds from a single image or text prompt...
8mos ago
GLM-4.5V - An open-source multimodal visual reasoning model from Zhipu AI

GLM-4.5V is an open-source visual reasoning model from Zhipu AI, billed as world-leading, with 106 billion total parameters and 12 billion activated parameters. The model is built on the new-generation text base model GLM-4.5-Air and offers strong visual understanding and reasoning, capable of handling images, video...
8mos ago
Genie 3 - A general-purpose world model from Google

Genie 3 is a next-generation general-purpose world model from Google DeepMind that generates highly dynamic, coherent virtual worlds in real time. Genie 3 simulates physical phenomena and natural ecosystems, and supports the creation of fantasy and historical scenarios. With text prompts, users can...
8mos ago
Claude Opus 4.1 - Anthropic's most powerful coding model

Claude Opus 4.1 is a state-of-the-art large language model from Anthropic, designed to handle complex tasks efficiently. The model excels at programming: it generates high-quality code, supports up to 32K tokens in a single output, and adapts to a wide range of programming styles...
8mos ago
gpt-oss - A family of open-source reasoning models from OpenAI

gpt-oss is a family of open-source reasoning models from OpenAI that give developers efficient, flexible, and easy-to-deploy AI solutions. gpt-oss comes in two versions, gpt-oss-120b with 117 billion parameters and support for 8...
8mos ago
MiDashengLM - Xiaomi's open-source sound understanding model

MiDashengLM is Xiaomi's open-source large model for efficient sound understanding, released as MiDashengLM-7B and focused on audio processing and understanding. The model is based on the Xiaomi Dasheng audio encoder and Qwen2.5-Omn...
8mos ago
MOSS-TTSD - A Tsinghua lab's open-source speech generation model for bilingual dialogue

MOSS-TTSD is an open-source spoken-dialogue speech generation model developed by the Speech and Language Laboratory of Tsinghua University. It converts text dialogue scripts into natural, fluent, and expressive conversational speech, and supports bilingual generation in English and Chinese.
8mos ago