AI Sharing Circle

AI is changing the world!
PartCrafter - Single-Image 3D Generation Model Open-Sourced by Peking University and ByteDance

PartCrafter is an advanced 3D generative model jointly proposed by Peking University, ByteDance, and Carnegie Mellon University. From a single RGB image, it generates multiple semantically meaningful and geometrically distinct 3D mesh parts in one pass. The parts are modeled through a compositional latent space and...
4mos ago
GigaWorld-0 - Open-Source World Model Framework from GigaAI

GigaWorld-0 is an open-source world model framework from GigaAI, a Chinese embodied-intelligence startup, aimed mainly at the data bottleneck in embodied AI. Efficiently generating high-quality, diverse, and physically realistic training data, push...
4mos ago
Mistral 3 - Mistral AI's Latest Open-Source Multimodal Large Model Series

Mistral 3 is the latest multimodal large model series open-sourced by Mistral AI. It includes the flagship Mistral Large 3 (675B total parameters) and the lighter Ministral series (3B/8B/14B), both supporting image understanding...
4mos ago
Vidi2 - ByteDance's Open-Source Multimodal Video Understanding and Generation Model

Vidi2 is a second-generation multimodal large model for video understanding and generation, open-sourced by ByteDance and focused on understanding, analyzing, and creating video content. It accepts text, video, and audio as joint input, and can simultaneously interpret visual content, audio, and natural-language instructions to achieve cross-modal interaction and push...
4mos ago
Alpamayo-R1 - NVIDIA's Open-Source Vision-Language-Action Model with Reasoning Capabilities

Alpamayo-R1 is an NVIDIA-developed Vision-Language-Action (VLA) model with reasoning capability, designed to improve autonomous-driving decisions in complex scenarios. By introducing a causal chain reasoning mechanism, it lets the vehicle analyze scene causality (e.g., "cause before...
4mos ago
Ovis-Image - Open-Source Text-to-Image Model from Alibaba's AIDC-AI Team

Ovis-Image is a 7-billion-parameter text-to-image model open-sourced by the AIDC-AI team of Alibaba International Digital Commerce Group, focusing on high-quality text rendering. Based on the Ovis-U1 architecture, it inherits the advanced visual decoder and bidirectional token refiner ...
4mos ago
Wujie-Emu3.5 - Open-Source Multimodal World Model from the Beijing Academy of Artificial Intelligence (BAAI)

Wujie-Emu3.5 is an open-source multimodal world model from the Beijing Academy of Artificial Intelligence (BAAI), with 34 billion parameters and native world-modeling capability. Trained on 10 trillion multimodal tokens (including 790 years of video data), it can simulate the laws of physics and perform image-text generation, visual guidance...
4mos ago
GELab-Zero - Open-Source On-Device Multimodal GUI Agent Model from the StepFun Team

GELab-Zero is an open-source on-device multimodal GUI agent model from the StepFun team, built on the Qwen3-VL-4B-Instruct base model with 4B parameters. It can recognize UI elements, perform operations such as tapping and swiping, and supports cross-application tasking ...
4mos ago
Depth Anything 3 - Open-Source 3D Visual Reconstruction Model from ByteDance Seed

Depth Anything 3 (DA3) is a 3D visual reconstruction model developed and open-sourced by the ByteDance Seed team. It uses a single Transformer architecture to reconstruct spatial geometry from any viewpoint; predicting only a depth map and a ray map is enough to recover the three-dimensional scene, compared to...
4mos ago
DeepSeek-Math-V2 - DeepSeek's Open-Source Mathematical Reasoning Model

DeepSeek-Math-V2 is an open-source mathematical reasoning model from DeepSeek, an AI company under High-Flyer. The latest version builds on DeepSeek-V3.2-Exp-Base, with performance surpassing that of Gemini DeepThink to reach the international number...
4mos ago