AI Sharing Circle

AI is changing the world!

VTP - MiniMax Hailuo Video Team's Open-Source Visual Generation Model Technology

VTP (Visual Tokenizer Pre-training) is a key technique for visual generative models proposed by the MiniMax Hailuo Video team. It improves the performance of generative systems by changing how the visual tokenizer is pre-trained. The traditional method...
3mos ago

T5Gemma 2 - Google's open-source next-generation encoder-decoder model

T5Gemma 2 is a new-generation encoder-decoder model open-sourced by Google. It builds on the Gemma 3 architecture and adds multimodal and long-context processing capabilities. It supports several data types, including text and images, and can handle very long contexts (up to 128K) in generating...
3mos ago

FunctionGemma - Google's open-source lightweight AI model optimized for function calling

FunctionGemma is a lightweight AI model from Google optimized for function calling. It is built on the Gemma 3 base model with 270 million parameters and can convert natural language into executable API calls in real time on phones, in browsers, and on other devices. The core feature is support for local off...
3mos ago

SHARP - Apple's open-source monocular view 3D scene synthesis technology

SHARP (Sharp Monocular View Synthesis in Less Than a Second) is Apple's open-source monocular view synthesis technology. It generates a realistic 3D representation of a scene from a single photo in less than a second...
3mos ago

TRELLIS.2 - Microsoft's Open-Source Large-Scale 3D Generative Model

TRELLIS.2 is Microsoft's open-source large-scale 3D generative model with 4 billion parameters, focused on high-fidelity image-to-3D generation. Its "O-Voxel" sparse voxel structure efficiently handles complex topology and sharp features to generate high-quality 3D assets with full PBR material ...
3mos ago

Step-GUI - StepFun's Open-Source AI Agent Model Series

Step-GUI is StepFun's open-source series of AI Agent models. It includes the cloud model Step-GUI, the first MCP protocol for GUI Agents, and Step-GUI Edge, the industry's first open-source on-device model that supports deployment on phones. Specialized...
4mos ago

A2UI - Google's open-source declarative protocol for Agent-driven user interfaces

A2UI (Agent-to-User Interface) is Google's open-source protocol for Agent-driven interfaces. It addresses the problem of AI agents generating complex interactive interfaces. AI agents describe the structure of the user interface in a declarative JSON format, and client applications ...
4mos ago

SAM Audio - Open-Source Multimodal Audio Segmentation Model from Meta

SAM Audio is an open-source multimodal audio segmentation model introduced by Meta that accurately separates arbitrary target sounds from complex audio mixtures. By combining text, visual, and temporal cues, it enables flexible and efficient audio processing for tasks such as audio editing, denoising, sound extraction, and...
4mos ago

Hunyuan World Model 1.5 - Tencent Hunyuan's Open-Source Real-Time World Model Generation Framework

Hunyuan World Model 1.5 (Tencent HY WorldPlay) is a real-time world model framework open-sourced by Tencent, billed as the industry's first, covering the full pipeline of data, training, and streaming inference deployment. Its core is the WorldPlay autoregressive diffusion model, which uses Next-F...
4mos ago

Molmo 2 - Ai2's open-source multimodal model series for video and image understanding

Molmo 2 is an open-source multimodal model released by the Allen Institute for AI (Ai2) to improve video and multi-image understanding. It comes in three variants: Molmo 2 (8B), Molmo 2 (4B) and Molmo 2-O...
4mos ago