Featured AI Tools List | page 5 | AI Sharing Circle

RoboBrain-X0 - BAAI's open source embodied model with zero-shot cross-embodiment generalization

RoboBrain-X0 is an open source embodied model from the Beijing Academy of Artificial Intelligence (BAAI), described as the world's first to support zero-shot cross-embodiment generalization and as a release of considerable industrial significance. It can drive multiple real robots of different configurations to complete basic manipulation tasks without fine-tuning, and after few-shot fine-tuning, it demonstrates the ability to replicate ...

Latest AI Resources

9mos ago

042.6K

What is a Diffusion Model? Explained in one article

A diffusion model is a generative model specialized for creating new data samples such as images, audio, or text. Its core idea is inspired by diffusion in physics, the natural spreading of particles from a region of high concentration to a region of low concentration. In the machine...

9mos ago

052.2K

What is Model Fine-tuning? Explained in one article

Model fine-tuning is a specific implementation of transfer learning in machine learning. The process starts from a pre-trained model, which has used large-scale datasets to learn generic patterns and develop broad feature extraction capabilities. The fine-tuning phase then introduces task-specific datasets to ...

9mos ago

043.4K

Lynx - ByteDance's open source high-fidelity video generation model

Lynx is a high-fidelity personalized video generation model open-sourced by ByteDance that can generate identity-consistent videos from a single portrait photo. Built on a Diffusion Transformer (DiT) base model, the introduction of ID-adapter and Ref-adapte...

Latest AI Resources

9mos ago

045.1K

Claude Sonnet 4.5 - Anthropic's most powerful AI programming model

Claude Sonnet 4.5 is an artificial intelligence model from Anthropic designed for programming, computer operations, and complex task automation. The model excels in code generation, long-duration task processing, reasoning, and mathematical computation, supporting everything from initial planning...

Latest AI Resources

9mos ago

049.8K

DeepSeek-V3.2-Exp - DeepSeek's latest open source experimental AI model

DeepSeek-V3.2-Exp is an open source experimental AI model from DeepSeek that significantly improves the efficiency of long-text processing by introducing the DeepSeek Sparse Attention (DSA) mechanism. The model is based on DeepSeek...

Latest AI Resources

9mos ago

045.4K

HunyuanImage 3.0 - Tencent's free, open source multimodal image generation model

HunyuanImage 3.0 is a native multimodal image generation model released and open-sourced by Tencent. With 80B parameters, it is described as the open source image generation model with the best evaluation results and the largest parameter count to date. HunyuanImage 3.0 supports real-time image generation, users can side...

Latest AI Resources

9mos ago

056.1K

Hunyuan3D-Part - Tencent's free, open source 3D part generation model

Hunyuan3D-Part is a 3D generation model released and open-sourced by Tencent. Composed of P3-SAM and X-Part, it is described as the first to achieve high-precision, controllable part-based 3D generation, and supports automatic generation of 50+ parts. Users can first use...

Latest AI Resources

9mos ago

059.4K

AudioFly - iFLYTEK's open source text-to-sound-effects AI model

AudioFly is iFLYTEK's open source AI model for generating sound effects from text. It is based on a latent diffusion model architecture with 1 billion parameters and is trained on large-scale, diverse audio-text datasets, covering AudioSet, AudioCaps, TUT and other public datasets and internal...

Latest AI Resources

9mos ago

054.1K

Hunyuan3D-Omni - Tencent Hunyuan's open source 3D model generation framework

Hunyuan3D-Omni is an open source 3D asset generation framework from Tencent's Hunyuan 3D team that achieves accurate 3D model generation through multiple control signals. Based on the Hunyuan3D 2.1 architecture, it introduces a unified control encoder that can handle point...

Latest AI Resources

9mos ago

054.2K

FLM-Audio - an open source full-duplex audio dialogue model from BAAI and Nanyang Technological University

FLM-Audio is a native full-duplex audio dialogue large model released by the Beijing Academy of Artificial Intelligence (BAAI) together with Spin Matrix and Nanyang Technological University, Singapore, supporting both Chinese and English. Adopting a native full-duplex architecture, it can merge listening, speaking and monologue at each time step...

Latest AI Resources

9mos ago

048.5K

What is the Attention Mechanism? Explained in one article

The attention mechanism is a computational technique that mimics human cognitive processes. It was initially applied in machine translation and later became an important part of deep learning.

9mos ago

049.8K
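
As a rough illustration of the attention mechanism entry above, here is a minimal sketch of scaled dot-product attention for a single query, which is one common formulation; the listing itself does not give a formula. The function names and the toy vectors are illustrative assumptions, not taken from the linked article.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector (illustrative sketch)."""
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Toy example: the query matches the first and third keys more than the second.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attention([1.0, 0.0], keys, values)
```

The weights show how much each input position contributes to the output, which is the sense in which the model "attends" to some positions more than others.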

What is the Transformer Architecture? Explained in one article

The Transformer architecture is a deep learning model designed for processing sequence-to-sequence tasks such as machine translation or text summarization. The core innovation is the complete reliance on self-attention mechanisms, eschewing traditional recurrent or convolutional structures. Allowing the model to process all elements of a sequence in parallel, large...

9mos ago

047.9K

What is a Pre-trained Model? Explained in one article

A pre-trained model is a fundamental and powerful technique in the field of artificial intelligence: a machine learning model that has been trained in advance on large-scale datasets. Such models form a broad knowledge base by processing massive amounts of information and learning generalized patterns and features from the data...

9mos ago

045.7K

What is a Large Language Model (LLM)? Explained in one article

A large language model (LLM) is a deep learning system trained on massive text data, with the Transformer architecture at its core. The self-attention mechanism of this architecture can effectively capture long-distance dependencies in language. The model's "large ...

9mos ago

045.6K

What is a Long Short-Term Memory (LSTM) network? Explained in one article

Long Short-Term Memory (LSTM) is a recurrent neural network variant specialized in processing sequence data. In the field of artificial intelligence, sequence data is widely used in tasks such as time series prediction, natural language processing and speech recognition.

9mos ago

040.2K

CWM - Meta FAIR's open source code world model

CWM (Code World Model) is a 32-billion-parameter open source language model released by the Meta FAIR team, designed for code generation and reasoning. By introducing the concept of a "world model", it can simulate the code execution process, predict changes in variable state, and advance...

Latest AI Resources

9mos ago

044.6K

Neovate Code - Ant Group's open source intelligent programming assistant

Neovate Code is an open source intelligent programming assistant from Ant Group's Alipay Experience Technology Department that uses AI to improve development efficiency. With its conversational development features, developers can describe requirements in natural language, and Neovate Code can understand and generate the corresponding generation...

Latest AI Resources

9mos ago

047.1K

Audio2Face - NVIDIA's open source AI model for 3D facial animation generation

Audio2Face is NVIDIA's open source AI tool capable of transforming audio input into realistic 3D facial animation. By analyzing speech features in the audio, such as phonemes and intonation, it generates precise lip synchronization and subtle emotional expressions to give vivid human expressions to virtual characters.

Latest AI Resources

9mos ago

049K

Qwen3-VL - Alibaba Cloud Tongyi Qianwen's open source multimodal vision-language large model

Qwen3-VL is an open source multimodal vision-language large model from Alibaba Cloud's Tongyi Qianwen (Qwen) team, with 235 billion parameters and model files of about 471GB. Available in Instruct and Thinking versions, it adopts an enhanced MRoPE interleaved layout, DeepStack and other technologies, which can effectively utilize the visual transform...

Latest AI Resources

9mos ago

064.2K

Qwen3Guard - Alibaba Qwen's open source safety model

Qwen3Guard is a safety guardrail model fine-tuned from the Qwen3 base model and designed for safety detection. It classifies the safety of prompts and responses, assigns risk levels, and supports English, Chinese, and other languages. Qwen3Guard comes with two pro...

Latest AI Resources

9mos ago

052.1K

Qwen3-TTS-Flash - Alibaba Tongyi's speech synthesis model

Qwen3-TTS-Flash is an advanced speech synthesis model introduced by Alibaba Tongyi, supporting 17 voices and 10 languages, covering Mandarin, English, dialects, etc. It offers excellent stability and highly expressive Chinese and English speech, and the model can automatically adjust its tone of voice to sound more vivid.

Latest AI Resources

9mos ago

061.7K

Qwen3-Omni - Alibaba Tongyi's omni-modal AI model

Qwen3-Omni is an omni-modal AI model introduced by the Alibaba Tongyi team that can handle multiple data types such as text, images, audio and video, and supports text interaction in 119 languages with low latency and high controllability.

Latest AI Resources

9mos ago

048.3K

DeepSeek-V3.1-Terminus - the latest version of DeepSeek's AI model

DeepSeek-V3.1-Terminus is an upgraded version of DeepSeek-V3.1, an artificial intelligence language model from the DeepSeek team. The model is optimized in terms of language consistency, code generation, and search capabilities to more accurately...

Latest AI Resources

9mos ago

045.1K

What is Federated Learning (FL)? Explained in one article

Federated Learning (FL) is an innovative machine learning approach first proposed by a Google research team in 2016 to address challenges in data privacy and distributed computing.

9mos ago

046K

Granite-Docling-258M - IBM's open source vision-language model

Granite-Docling-258M is an ultra-compact open source visual language model from IBM designed for efficient document conversion. The model converts documents into machine-readable formats while leaving layout, tables, formulas, and other elements intact.

Latest AI Resources

9mos ago

042.9K

Lucy Edit - an open source AI video editing tool driven by natural language descriptions

Lucy Edit is an open source AI video editing tool developed by Decart AI. It allows users to edit video through simple natural language descriptions, such as "change the character into a polar bear" or "turn the scene into a 2D cartoon style", without the need for complex fine-tuning or the use of masks ...

Latest AI Resources

9mos ago

053.6K

LongCat-Flash-Thinking - Meituan's open source efficient reasoning model

LongCat-Flash-Thinking is a highly efficient reasoning model released by Meituan's LongCat team that has become more powerful and specialized while maintaining the extreme speed of LongCat-Flash-Chat. The model is based on logic, math, code, intelligence...

Latest AI Resources

9mos ago

041.9K

Ling-V2 - Ant Group Bailing's open source MoE language model series

Ling-V2 is a family of large language models based on the MoE architecture introduced by Ant Group's Bailing team. The first version, Ling-mini-2.0, has 16 billion total parameters, with only 1.4 billion parameters activated per input token.

Latest AI Resources

9mos ago

043.1K

Kronos - a K-line (candlestick) chart foundation model for finance, jointly open-sourced by Tsinghua and Microsoft

Kronos is described as the first K-line (candlestick) chart foundation model for financial markets, jointly open-sourced by Tsinghua University and Microsoft Research Asia. It analyzes K-line data of stocks, cryptocurrencies and other assets, including open, high, low, and close prices and volume, to predict future price movements.

Latest AI Resources

9mos ago

070.5K

Wan2.2-Animate - Tongyi Wanxiang's open source motion generation model

Wan2.2-Animate is an open source motion generation model that supports motion imitation and role-play modes. Users only need to input a character picture and a reference video, and the model can transfer the movements and expressions of the character in the video to the character in the picture, giving the picture character dynamic expression ...

Latest AI Resources

9mos ago

045.4K

Xiaomi-MiMo-Audio - Xiaomi's first open source native end-to-end speech large model

Xiaomi-MiMo-Audio is Xiaomi's open source 7-billion-parameter end-to-end speech large model with features such as multi-language dialogue, speech continuation, few-shot generalization, and audio understanding. It is reported to reach SOTA level on speech intelligence and audio understanding benchmarks, surpassing Google Gemi...

Latest AI Resources

9mos ago

050.4K

InternVLA-A1 - Shanghai AI Lab's open source embodied large model with integrated manipulation capabilities

InternVLA-A1 is an embodied manipulation large model open-sourced by Shanghai Artificial Intelligence Laboratory. It integrates understanding, imagination, and execution, and can complete tasks accurately. The model fuses real and simulated operational data, and automates the construction of massive multimodal through large-scale virtual-real hybrid scene assets...

Latest AI Resources

9mos ago

052.7K

VoxCPM - an end-to-end TTS model open-sourced by ModelBest and Tsinghua

VoxCPM is a speech generation model jointly open-sourced by ModelBest and the Shenzhen International Graduate School of Tsinghua University. VoxCPM adopts an end-to-end diffusion autoregressive architecture to generate continuous speech representations directly from text, breaking through the limitations of traditional discrete tokenization. Through hierarchical language modeling and finite scalar quantization...

Latest AI Resources

9mos ago

054.5K

InternVLA-N1 - Shanghai AI Lab's open source end-to-end dual-system navigation large model

InternVLA-N1 is an open source end-to-end dual-system navigation large model from Shanghai Artificial Intelligence Laboratory. In its dual-system architecture, System 2 is responsible for understanding language instructions and planning long-range paths, while System 1 focuses on high-frequency response and agile obstacle avoidance. The model is trained entirely based on synthetic data through large-scale digital ...

Latest AI Resources

9mos ago

051.2K

WebWeaver - Alibaba Tongyi's new open source dual-agent framework

WebWeaver is a new dual-agent framework introduced by the Alibaba Tongyi team. It is mainly used for open-ended deep research and simulates the human research process, split between two agents: a planner and a writer.

Latest AI Resources

9mos ago

049.7K

MCP Registry - the official MCP server management platform from GitHub

The MCP Registry is a centralized platform from GitHub that helps developers discover and install MCP servers more easily. The MCP Registry is here to help developers quickly find the AI tools they need in one place, greatly simplifying...

Latest AI Resources

9mos ago

047.3K

VLAC - Shanghai AI Lab's open source embodied reward large model

VLAC is an open source embodied reward large model from Shanghai Artificial Intelligence Laboratory. Based on the InternVL multimodal large model, it integrates Internet video data and robot operation data to provide process rewards and task completion estimates for robot reinforcement learning in the real world. VLAC can effectively ...

Latest AI Resources

9mos ago

044.2K

Tongyi DeepResearch - Alibaba Tongyi's open source deep research agent

Tongyi DeepResearch is an open source agent launched by Alibaba, designed for deep information retrieval and complex task reasoning, with 30 billion parameters, supporting multiple reasoning modes, including ReAct mode and deep mode...

Latest AI Resources

9mos ago

051.5K

InternVLA-M1 - Shanghai AI Lab's open source embodied dual-system manipulation "brain"

InternVLA-M1 is an open source embodied manipulation "brain" from Shanghai Artificial Intelligence Laboratory, a dual-system manipulation large model oriented toward instruction following. It builds a complete closed loop covering "think-act-learn" and is responsible for high-level spatial reasoning and task planning. The model adopts a two-phase training cur...

Latest AI Resources

9mos ago

041.1K

OpenAI's PDF guide "Staying Ahead in the Age of AI" - with download link

Staying ahead in the age of AI is an AI leadership guide from OpenAI that helps business leaders maintain a competitive edge in the age of AI. The guide points to the rapid growth of AI, with faster model releases, lower costs, and faster enterprise adoption...

Latest AI Resources Course materials

9mos ago

052.4K

Zhejiang University's free PDF "Fundamentals of Large Models" - with download link

Fundamentals of Large Models provides an in-depth analysis of the core technologies and practical paths of large language models (LLMs). Starting from the fundamental theory of language modeling, it systematically explains the principles of model design based on statistics, recurrent neural networks (RNNs), and the Transformer architecture, focusing on the three major big language model...

Latest AI Resources Course materials

9mos ago

053.7K

What is a Recurrent Neural Network (RNN)? Explained in one article

Recurrent Neural Network (RNN) is a neural network architecture designed for processing sequential data. Sequential data refers to a collection of data with temporal order or dependencies, such as linguistic text, speech signals, or time series.

9mos ago

048.7K

What is a Neural Network? Explained in one article

Neural Network (NN) is a computational model inspired by the way neurons work in the biological brain.

9mos ago

041.1K

PromptEnhancer - Tencent Hunyuan's open source AI prompt enhancement tool

PromptEnhancer is an open source prompt enhancement tool from Tencent's Hunyuan team for improving the output of text-to-image (T2I) models. Through the Chain-of-Thought (CoT) approach to the use of ...

Latest AI Resources

9mos ago

047.3K

LLaSO - described as the industry's first fully open source speech model, from Logic Intelligence

LLaSO is an open source speech model launched by Beijing Depth Logic Intelligence Technology Co. Ltd. It addresses the problems of scattered data and insufficient task coverage in large-scale speech language modeling by integrating speech and text data and providing alignment datasets, instruction fine-tuning datasets and evaluation benchmarks.

Latest AI Resources

9mos ago

038.3K

Hunyuan 3D 3.0 - Tencent's 3D generation model with ultra-high-definition modeling support

Hunyuan 3D 3.0 is an advanced 3D generation model launched by Tencent, based on 3D-DiT hierarchical sculpting technology, with a geometric resolution of up to 1536³. It can generate ultra-high-definition, detail-rich 3D models and excels at character modeling, accurately shaping facial features and body shape.

Latest AI Resources

9mos ago

056.8K

UnifoLM-WMA-0 - Unitree Robotics' open source world-model-action architecture

UnifoLM-WMA-0 is an open source world-model-action architecture from Unitree Robotics that spans multiple classes of robot embodiments and is designed for general robot learning. Composed of a world model and an action architecture, the world model understands the physical laws of robot-environment interaction, and the action architecture is responsible for specific...

Latest AI Resources

9mos ago

058.7K

InfiniteTalk - Meituan Visual AI's open source audio-driven video generation tool

InfiniteTalk is an audio-driven video generation tool developed by the MeiGen-AI team that generates talking videos of unlimited length based on the input audio. The core advantage lies in the precise lip synchronization technology, which can perfectly match the audio with the character's mouth shape to generate natural and smooth...

Latest AI Resources

9mos ago

069.6K

Mini-o3 - a visual reasoning model jointly open-sourced by ByteDance and HKU

Mini-o3 is an open source model jointly launched by ByteDance and the University of Hong Kong, focusing on solving complex visual search problems. The model has a powerful multi-round interactive reasoning capability, and can locate the target through deep exploration and trial-and-error.

Latest AI Resources

9mos ago

043.7K

GPT-5-Codex - OpenAI's most powerful programming model

GPT-5-Codex is a programming-optimized model from OpenAI, built on GPT-5 with further enhancements and designed for software engineers. The model generates high-quality code quickly, supports multiple programming languages, and optimizes existing code to improve performance.

Latest AI Resources

9mos ago

040.8K

ROMA - an open source meta-agent framework that automatically decomposes complex tasks for parallel processing

ROMA (Recursive-Open-Meta-Agent) is an open source meta-agent framework developed by Sentient AGI to efficiently solve complex problems through recursive task decomposition and parallel processing. Support for Python 3.12+, Docker and ...

Latest AI Resources

9mos ago

055.7K

Lumina-DiMOO - a multimodal large model open-sourced by Shanghai AI Lab and Huawei Ascend

Lumina-DiMOO is a new generation of unified model for multimodal generation and understanding, launched by Shanghai Artificial Intelligence Laboratory in conjunction with Huawei Ascend at the World Artificial Intelligence Conference 2025. Based on the Ascend AI basic hardware and software platform and the MindSpeed MM multimodal large model suite, it accomplishes...

Latest AI Resources

9mos ago

049.8K

Hyprnote - an open source, local-first AI meeting note-taking tool

Hyprnote is an open source, local-first AI meeting note-taking tool designed for professionals to protect user privacy and improve meeting efficiency. Adopting the "local first" principle, all data storage and processing is done on the user's local device to ensure data security and support offline operation.

Latest AI Resources

9mos ago

049.5K

MobileLLM-R1 - Meta's open source series of specialized efficient reasoning models

MobileLLM-R1 is Meta's open source series of efficient reasoning models designed for mathematical, programming and scientific reasoning. It includes base and final models in 140-million, 360-million, and 950-million-parameter versions. The models are not generic chat models and are supervised fine-tuned (SFT...

Latest AI Resources

9mos ago

041.2K

ERNIE-4.5-21B-A3B-Thinking - Baidu's open source reasoning model

ERNIE-4.5-21B-A3B-Thinking is Baidu's open source large language model focused on reasoning tasks. It uses a Mixture-of-Experts (MoE) architecture with 21 billion total parameters, of which 3 billion are activated per token, and supports a 128K long context window ...

Latest AI Resources

9mos ago

037.9K

What is AI Fairness? Explained in one article

AI fairness is the interdisciplinary field of ensuring that AI systems treat all individuals and groups in a fair and unbiased manner throughout their design, development, deployment, and operation lifecycle.

9mos ago

046.4K

What is Meta-Learning? Explained in one article

Meta-Learning, or learning how to learn, is an important branch of the machine learning field that focuses on developing learning algorithms that can quickly adapt to new tasks.

9mos ago

051.8K

MobiAgent - Shanghai Jiao Tong University's open source full-stack framework for building mobile agents

MobiAgent is an open source mobile agent toolchain from the IPADS Lab of Shanghai Jiao Tong University that helps users build their own mobile intelligent assistants. By recording the user's operation trajectories and generating high-quality data, it trains an agent that can understand natural language commands. Core features include efficient...

Latest AI Resources

9mos ago

046.5K

ZipVoice - Xiaomi's open source speech synthesis model series

ZipVoice is a series of speech synthesis (TTS) models based on the Flow Matching architecture released by Xiaomi, including ZipVoice (zero-shot single-speaker speech synthesis model) and ZipVoice-Dialog (zero-shot conversational speech synthesis...

Latest AI Resources

9mos ago

057.6K

PP-OCRv5 - Baidu's open source next-generation text recognition AI model

PP-OCRv5 is the latest generation of text recognition AI model released by Baidu. With a lightweight design and a parameter count of only 0.07B, it is suitable for efficient operation on CPU and edge devices, and can process more than 370 characters per second. The model supports Simplified Chinese, Traditional Chinese, English, Japanese and Pinyin...

Latest AI Resources

9mos ago

071.4K

Youtu-GraphRAG - Tencent Youtu Lab's open source graph retrieval-augmented generation framework

Youtu-GraphRAG is an open source graph retrieval-augmented generation framework from Tencent's Youtu Lab that helps large language models handle complex Q&A tasks more accurately. By constructing a four-layer knowledge tree, it breaks knowledge down into four levels (attributes, relationships, keywords and communities) to realize the self-directed performance of cross-domain knowledge...

Latest AI Resources

9mos ago

048.5K

Stand-In - Tencent WeChat Vision's open source lightweight video generation framework

Stand-In is a lightweight, plug-and-play identity-preserving video generation framework from Tencent's WeChat Vision team. Focused on preserving specific identity features in video generation, it only needs to train additional parameters amounting to about 1% of the base model, and can achieve excellent results in face similarity and naturalness.

Latest AI Resources

9mos ago

047.4K

IndexTTS2 - Bilibili's free, open source TTS model, the first to support precise duration control

IndexTTS2 is a new free text-to-speech (TTS) model open-sourced by Bilibili's speech team. It is presented as a major breakthrough in emotional expression and duration control, and as the first autoregressive TTS model that supports precise duration control. Supports zero-shot voice cloning, only one audio file can accurately copy the sound...

Latest AI Resources

9mos ago

0110.8K

MiniMax Music 1.5 - MiniMax's latest AI music generation model

MiniMax Music 1.5 is an advanced AI music generation tool that supports generating up to 4 minutes of music based on users' natural language descriptions. The model supports a variety of music styles and mood customization, generating a natural and full vocal color, smooth transitions, richly layered arrangements...

Latest AI Resources

10mos ago

048.6K

What is AI Safety? Explained in one article

Artificial Intelligence Safety (AI Safety) is the cutting-edge interdisciplinary field of ensuring that AI systems, especially those that are increasingly powerful and autonomous, act reliably and predictably throughout their lifecycle in accordance with human intent, without harmful consequences.

10mos ago

045.7K

What is Self-Supervised Learning (SSL)? Explained in one article

Self-Supervised Learning (SSL) is an emerging learning paradigm in machine learning whose core idea is to automatically generate supervisory signals from unlabeled data and train models to learn useful representations of the data.

10mos ago

046K

What is Artificial Super Intelligence (ASI)? Explained in one article

Artificial Super Intelligence (ASI) is an intelligent system that exceeds human intelligence, with capabilities that surpass those of humans in all domains, including cognition, creativity, problem solving, and decision-making.

10mos ago

059.6K

What is Transfer Learning? Explained in one article

Transfer learning is an important branch of machine learning whose core idea is to apply the knowledge learned from one task or domain to another related but different task or domain.

10mos ago

045.4K

HuMo - a multimodal video generation framework jointly open-sourced by Tsinghua University and ByteDance

HuMo is a multimodal video generation framework jointly open-sourced by Tsinghua University and ByteDance Intelligent Creation Lab, focusing on human-centered video generation. It can generate high-quality, fine-grained and controllable human videos from a variety of modal inputs such as text, images and audio. HuMo supports a powerful text prompt-following capability...

Latest AI Resources

10mos ago

0129.9K

AnyI2V - an open source image animation generation framework from Fudan, Alibaba DAMO Academy and others

AnyI2V is an image animation generation framework jointly launched by Fudan University, Alibaba DAMO Academy and others. It supports converting static conditional images (e.g., meshes, point clouds, etc.) into dynamic videos without the need for complex training processes and large amounts of data.

Latest AI Resources

10mos ago

043K

SRPO - A text-to-image generation model from Tencent Hunyuan

SRPO (Semantic Relative Preference Optimization) is a text-to-image generation model introduced by Tencent Hunyuan. It optimizes the reward mechanism through text-conditioned signals, which allows rewards to be adjusted online and reduces dependence on offline fine-tuning.

Latest AI Resources

10mos ago

057.3K

Qwen3-Next - The latest base model from Alibaba Tongyi

Qwen3-Next is a new-generation hybrid-architecture large model open-sourced by Alibaba Tongyi. It combines Gated DeltaNet and Gated Attention, handles long text well, runs inference quickly, and saves computing resources.

Latest AI Resources

10mos ago

043.5K

Wenxin Large Model X1.1 - Baidu's deep thinking model with stronger comprehension

Wenxin Large Model X1.1 is a deep thinking model launched by Baidu, built on a hybrid reinforcement learning framework that focuses on improving language understanding and generation. The model excels at handling complex questions, following instructions, and agent behavior, and can provide accurate, knowledgeable answers and high-quality text.

Latest AI Resources

10mos ago

049.5K

HunyuanImage 2.1 - Tencent's open source text-to-image model

HunyuanImage 2.1 is Tencent's open source text-to-image model, designed for high-quality image generation. The model supports native 2K resolution and accurately renders complex scenes and details, so that characters' expressions and movements are vividly reproduced.

Latest AI Resources

10mos ago

044.6K

AntSK FileChunk - Free AI Semantic Document Slicing Tool with Dynamic Slice Adjustment

AntSK FileChunk is a free intelligent document slicing tool designed for RAG (Retrieval Augmented Generation) applications. Built around semantics, it slices documents into semantically complete, coherent segments, supports multiple languages, and dynamically adjusts slice size to keep the context coherent.

Latest AI Resources

10mos ago

051.6K

UnifiedTTS - One-stop TTS API Service Platform with Real-time Performance Monitoring

UnifiedTTS is a one-stop platform for text-to-speech (TTS) services. It supports multiple languages, including Chinese, English, Japanese and Korean, to meet the needs of global business. Through a unified API interface, it integrates many mainstream TTS services, including Micro...

Latest AI Resources

10mos ago

055.2K

MiniCPM 4.1 - An ultra-efficient on-device large model from ModelBest

MiniCPM 4.1 is an ultra-efficient on-device large language model introduced by ModelBest. With the InfLLM v2 sparse attention architecture, each token only needs to compute its relevance to fewer than 5% of the other tokens, which significantly reduces the processing overhead of long text. In a 128K long text scenario...

Latest AI Resources

10mos ago

045.1K

WeKnora - Tencent WeChat's Open Source Document Understanding and Semantic Retrieval Framework

WeKnora is a document understanding and semantic retrieval framework based on large language models (LLMs), open-sourced by the Tencent WeChat team. Designed for documents with complex structure and heterogeneous content, it uses a modular architecture that integrates multimodal preprocessing, semantic vector indexing, intelligent recall, and large-model generative reasoning ...

Latest AI Resources

10mos ago

090.2K

XTuner V1 - A large model training engine open-sourced by Shanghai AI Lab

XTuner V1 is a new-generation large model training engine open-sourced by Shanghai Artificial Intelligence Laboratory (Shanghai AI Lab), designed for training ultra-large-scale sparse Mixture-of-Experts (MoE) models. Built on PyTorch FSDP, it achieves high performance through multi-dimensional optimization of memory, communication and load ...

Latest AI Resources

10mos ago

046.4K

Qwen3-ASR-Flash - A series of speech recognition models from Alibaba's Tongyi Qianwen

Qwen3-ASR-Flash is Alibaba's latest high-precision speech recognition model, based on the Qwen3 base model and trained on massive multimodal data. It supports 11 languages and multiple accents, including Mandarin and dialects such as Sichuanese, Minnan, Wu, and Cantonese...

Latest AI Resources

10mos ago

059.3K

What is AI Governance? Explained in one article

AI governance is a comprehensive framework covering technology, ethics, law, and society that guides, manages, and oversees the entire lifecycle of AI systems, from design and development through deployment and end use. The core goal is not to hinder technological innovation, but to ensure that the development and application of AI technologies begin...

10mos ago

053.5K

Andrew Ng's free course: LangChain for LLM Application Development

LangChain for LLM Application Development is an online course presented by DeepLearning.AI, featuring LangChain founder Harrison Chase and Andrew Ng.

Latest AI Resources Course materials

10mos ago

068.3K

Andrew Ng's free course: How Transformer LLMs Work

How Transformer LLMs Work is a course that DeepLearning.AI offers with Jay Alammar and Maarten Grootendorst, authors of Hands-On Large Language Models...

Latest AI Resources Course materials

10mos ago

063.1K

What is Semi-Supervised Learning? Explained in one article

Semi-supervised learning is an important branch of machine learning that trains a model jointly on a small amount of labeled data and a large amount of unlabeled data to improve learning performance and generalization.

10mos ago

052.8K

What is Unsupervised Learning? Explained in one article

Unsupervised learning is an important branch of machine learning that focuses on processing data sets that are not pre-labeled.

10mos ago

044.7K

Seedream 4.0 - The latest-generation image creation model from ByteDance

Seedream 4.0 is an advanced image generation and editing tool launched by ByteDance. It unifies generation and editing and offers precise instruction-based editing, high feature retention, and deep intent understanding.

Latest AI Resources

10mos ago

091.9K

rStar2-Agent - Microsoft's Open Source Efficient AI Reasoning Model

rStar2-Agent is an advanced AI mathematical reasoning model open-sourced by Microsoft that demonstrates strong mathematical problem solving, achieving 80.6% accuracy on the AIME24 test. The model also has scientific reasoning capabilities, achieving in the GPQA-Diamond benchmark...

Latest AI Resources

10mos ago

046.1K

Qwen3-Max-Preview - The flagship large language model from Tongyi Qianwen

Qwen3-Max-Preview is the latest flagship large language model released by Tongyi Qianwen. It is the largest model in the Qwen3 family, with over 1 trillion parameters. The model shows significant improvements in reasoning, instruction following, multilingual support and long-tail knowledge coverage...

Latest AI Resources

10mos ago

049.7K

OneCAT - A multimodal model open-sourced by Meituan and Shanghai Jiao Tong University

OneCAT is a new unified multimodal model launched by Meituan together with Shanghai Jiao Tong University. It adopts a decoder-only architecture and integrates multimodal understanding, text-to-image generation, and image editing. The model abandons the traditional multimodal design that relies on external vision encoders and tokenizers through modality-specific...

Latest AI Resources

10mos ago

048.1K

Claudable - Open Source AI Web Application Builder That Generates Code from Natural Language

Claudable is an open source web application builder based on Next.js that combines the advanced AI agent capabilities of Claude Code and Cursor CLI with Lovable's simple and intuitive application building experience....

Latest AI Resources

10mos ago

053K

FineVision - Open Source Vision-Language Dataset from Hugging Face

FineVision is Hugging Face's open source vision-language dataset for training advanced vision-language models. It contains 17.3 million images, 24.3 million samples, 88.9 million conversation turns, and 9.5 billion answer tokens. The dataset aggregates...

Latest AI Resources

10mos ago

051.1K

InfinityHuman - A long-video digital human generation model from ByteDance and Zhejiang University

InfinityHuman is a commercial-grade, audio-driven model for long-duration character video generation, jointly launched by ByteDance and Zhejiang University. It generates high-resolution, long, and visually consistent character videos.

Latest AI Resources

10mos ago

048.4K

Kimi K2-0905 - The latest model release from Moonshot AI

Kimi K2-0905 is an advanced AI model from Moonshot AI that excels at programming assistance, generates code efficiently, and produces clean, standards-compliant code for front-end development. The model's context length is extended to 256K to handle complex tasks.

Latest AI Resources

10mos ago

085.7K

What is Reinforcement Learning? Explained in one article

Reinforcement learning is an important branch of machine learning in which an agent learns autonomously, through continuous interaction with its environment, how to make decisions that maximize long-term cumulative reward.

10mos ago

044.2K

What is Supervised Learning? Explained in one article

Supervised learning is one of the most common and basic methods of machine learning. Its core idea is to teach a model to make predictions or judgments using existing data sets that come with "correct answers".

10mos ago

047.4K

What is Deep Learning? Explained in one article

Deep Learning (DL) is a branch of machine learning that centers on the use of multi-layer artificial neural networks to learn and represent complex patterns in data.

10mos ago

047.5K

HunyuanWorld-Voyager - Tencent's open source ultra-long-range roaming world model

HunyuanWorld-Voyager (Hunyuan Voyager for short) is an ultra-long-range roaming world model released by Tencent, described as the industry's first to support native 3D reconstruction. It is a novel video diffusion framework that generates a 3D point cloud sequence along a user-defined camera path from a single image, supporting...

Latest AI Resources

10mos ago

050.2K

Hunyuan-MT-7B - A lightweight translation model open-sourced by Tencent Hunyuan

Hunyuan-MT-7B is a lightweight translation model introduced by Tencent's Hunyuan team, with 7 billion parameters. It supports translation among 33 languages and 5 Chinese ethnic minority languages and dialects, including Cantonese, Uyghur, and Tibetan. In the Association for Computational Linguistics (ACL) WMT2025 competition...

Latest AI Resources

10mos ago

046.7K