AI Sharing Circle

AI is changing the world!
VoxCPM - End-to-End TTS Model Open-Sourced by ModelBest and Tsinghua University

VoxCPM is a speech generation model jointly open-sourced by ModelBest and the Shenzhen International Graduate School of Tsinghua University. VoxCPM adopts an end-to-end diffusion autoregressive architecture that generates continuous speech representations directly from text, avoiding the limitations of traditional discrete tokenization. Through hierarchical language modeling and finite scalar quantization...
3wks ago
014K
InternVLA-N1 - Shanghai AI Lab's Open-Source End-to-End Dual-System Navigation Large Model

InternVLA-N1 is an open-source end-to-end dual-system large model for navigation from Shanghai Artificial Intelligence Laboratory. In its dual-system architecture, System 2 is responsible for understanding language commands and planning long-range paths, while System 1 focuses on high-frequency response and agile obstacle avoidance. The model is trained entirely based on synthetic data through large-scale digital ...
3wks ago
09.8K
VLAC - Shanghai AI Lab's Open-Source Embodied Reward Large Model

VLAC is an open-source embodied reward large model from Shanghai Artificial Intelligence Laboratory. Built on the InternVL multimodal large model, it integrates Internet video data and robot manipulation data to provide process rewards and task-completion estimates for real-world robot reinforcement learning. VLAC can effectively ...
3wks ago
09.4K
InternVLA-M1 - Shanghai AI Lab's Open-Source Embodied Dual-System Manipulation "Brain"

InternVLA-M1 is an open-source embodied manipulation "brain" from Shanghai Artificial Intelligence Laboratory: a dual-system large model for manipulation, oriented to instruction following. It builds a complete closed loop covering "think-act-learn" and is responsible for high-level spatial reasoning and task planning. The model adopts a two-phase training cur...
4wks ago
010.7K
PromptEnhancer - Tencent Hunyuan's Open-Source AI Prompt Enhancement Tool

PromptEnhancer is an open-source prompt enhancement tool from Tencent's Hunyuan team that improves the output of text-to-image (T2I) models. Through a Chain-of-Thought (CoT) approach to the use of ...
4wks ago
010K
UnifoLM-WMA-0 - Unitree Robotics' Open-Source World-Model-Action Architecture

UnifoLM-WMA-0 is an open-source world-model-action architecture from Unitree Robotics that spans multiple types of robot embodiments and is designed for general robot learning. It consists of a world model and an action architecture: the world model understands the physical laws of robot-environment interaction, and the action architecture is responsible for specific...
4wks ago
011.6K
InfiniteTalk - Meituan Vision AI's Open-Source Audio-Driven Video Generation Tool

InfiniteTalk is an audio-driven video generation tool developed by the MeiGen-AI team that generates talking videos of unlimited length from input audio. Its core advantage lies in precise lip synchronization, which matches the character's mouth shape to the audio to generate natural and smooth...
4wks ago
015.1K
ROMA - Open-Source Meta-Agent Framework That Automatically Decomposes Complex Tasks for Parallel Processing

ROMA (Recursive-Open-Meta-Agent) is an open-source meta-agent framework developed by Sentient AGI that solves complex problems through recursive task decomposition and parallel processing. It supports Python 3.12+, Docker and ...
4wks ago
012.6K
Lumina-DiMOO - Multimodal Large Model Open-Sourced by Shanghai AI Lab and Huawei Ascend

Lumina-DiMOO is a new-generation unified model for multimodal generation and understanding, launched by Shanghai Artificial Intelligence Laboratory together with Huawei Ascend at the World Artificial Intelligence Conference 2025. Based on the Ascend AI basic hardware and software platform and the MindSpeed MM multimodal large model suite, it accomplishes...
4wks ago
09.4K
Hyprnote - Open-Source, Local-First AI Meeting Note-Taking Tool

Hyprnote is an open-source, local-first AI meeting note-taking tool designed for professionals, built to protect user privacy and improve meeting efficiency. Following the "local-first" principle, it stores and processes all data on the user's local device, which keeps the data secure and allows offline operation.
4wks ago
08.6K