Latest AI Resources

2,950 articles in total
Agentic AI - Andrew Ng's Latest Free Course on AI Agents

Agentic AI is the newest course on AI agents launched by Andrew Ng. The course focuses on the design and construction of agents, covering four major design patterns: reflection, tool use, planning, and multi-agent collaboration. Learners will master how to make agents check their outputs and adjust autonomously through theoretical explanations and code practice...
4mos ago
039K
EchoCare - Open-Source Ultrasound Foundation Model from the Chinese Academy of Sciences' Hong Kong Institute

EchoCare is an ultrasound foundation model developed by the Centre for Artificial Intelligence and Robotics (CAIR) at the Hong Kong Institute of Science & Innovation, Chinese Academy of Sciences (CAS). It is trained on the world's largest ultrasound image dataset (more than 4.5 million images), covering multiple centers, regions, and ethnicities, and more than 50 individuals...
4mos ago
025.1K
RoboBrain-X0 - BAAI's Open-Source Embodied Model with Zero-Shot Cross-Embodiment Generalization

RoboBrain-X0, open-sourced by the Beijing Academy of Artificial Intelligence (BAAI), is the world's first open-source embodied model that supports zero-shot cross-embodiment generalization, which is of great industrial significance. It can drive multiple real robots of different configurations to complete basic manipulation tasks without fine-tuning, and after fine-tuning on a small number of samples, it demonstrates the ability to replicate ...
4mos ago
025.4K
CWM - Meta FAIR's Open-Source Code World Model

CWM (Code World Model) is a 32-billion-parameter open-source language model released by the Meta FAIR team, designed for code generation and reasoning. By introducing the concept of a "world model", it can simulate code execution, predict changes in variable state, and advance...
4mos ago
028.4K
Neovate Code - Ant Group's Open-Source Intelligent Programming Assistant

Neovate Code is an open-source intelligent programming assistant from the Experience Technology Department of Ant Group's Alipay, which uses AI to improve development efficiency. With its conversational development features, developers describe their requirements in natural language, and Neovate Code can understand them and generate the corresponding...
4mos ago
032.1K
Qwen3Guard - Alibaba Qwen's Open-Source Safety Model

Qwen3Guard is a safety guardrail model fine-tuned from the Qwen3 base model and designed for safety detection. It classifies the safety of prompts and responses, assigns risk levels, and supports English, Chinese, and other languages. Qwen3Guard comes with two pro...
4mos ago
035.2K
Qwen3-TTS-Flash - Speech Synthesis Model from Alibaba Tongyi

Qwen3-TTS-Flash is an advanced speech synthesis model introduced by Alibaba Tongyi, supporting 17 voices and 10 languages, covering Mandarin, English, dialects, etc. It delivers stable, highly expressive Chinese and English speech, and the model can automatically adjust its tone of voice to make speech more vivid.
4mos ago
043.5K
InternVLA-A1 - Shanghai AI Lab's Open-Source Embodied Manipulation Model with Integrated Capabilities

InternVLA-A1 is an embodied manipulation large model open-sourced by Shanghai Artificial Intelligence Laboratory. It integrates understanding, imagination, and execution, and can complete tasks accurately. The model fuses real and simulated manipulation data, and uses large-scale hybrid virtual-real scene assets to automatically construct massive multimodal...
5mos ago
032.5K
VoxCPM - End-to-End TTS Model Open-Sourced by ModelBest and Tsinghua University

VoxCPM is a speech generation model jointly open-sourced by ModelBest and the Shenzhen International Graduate School of Tsinghua University. VoxCPM adopts an end-to-end diffusion autoregressive architecture to generate continuous speech representations directly from text, breaking through the limitations of traditional discrete tokenization. Through hierarchical language modeling and finite scalar quantization...
5mos ago
037K
InternVLA-N1 - Shanghai AI Lab's Open-Source End-to-End Dual-System Navigation Model

InternVLA-N1 is an open-source end-to-end dual-system navigation large model from Shanghai Artificial Intelligence Laboratory. In its dual-system architecture, System 2 is responsible for understanding language instructions and planning long-range paths, while System 1 focuses on high-frequency response and agile obstacle avoidance. The model is trained entirely on synthetic data through large-scale digital ...
5mos ago
031.2K
VLAC - Shanghai AI Lab's Open-Source Embodied Reward Model

VLAC is an open-source embodied reward large model from Shanghai Artificial Intelligence Laboratory. Built on the InternVL multimodal large model, it integrates Internet video data and robot manipulation data to provide process rewards and task completion estimates for robot reinforcement learning in the real world. VLAC can effectively ...
5mos ago
026.6K
"Fundamentals of Large Models" - Free PDF from Zhejiang University, with Download Link

"Fundamentals of Large Models" provides an in-depth analysis of the core technologies and practical paths of large language models (LLMs). Starting from the fundamental theory of language modeling, it systematically explains the design principles of models based on statistics, recurrent neural networks (RNNs), and the Transformer architecture, focusing on the three major large language model...
5mos ago
033.9K
MobiAgent - Shanghai Jiao Tong University's Open-Source Full-Stack Framework for Building Mobile Agents

MobiAgent is an open-source mobile agent toolchain from the IPADS Lab of Shanghai Jiao Tong University that helps users build their own mobile intelligent assistants. By recording the user's operation trajectories and generating high-quality data, it trains an agent that can understand natural language commands. Core features include efficient...
5mos ago
031.6K
Youtu-GraphRAG - Tencent Youtu Lab's Open-Source Graph Retrieval-Augmented Generation Framework

Youtu-GraphRAG is an open-source graph retrieval-augmented generation framework from Tencent's Youtu Lab that helps large language models handle complex Q&A tasks more accurately. By constructing a four-layer knowledge tree, it decomposes knowledge into four levels (attributes, relationships, keywords, and communities) to realize the self-directed performance of cross-domain knowledge...
5mos ago
031.7K
MiniMax Music 1.5 - MiniMax's Latest AI Music Generation Model

MiniMax Music 1.5 is an advanced AI music generation tool that can generate up to 4 minutes of music from a user's natural language description. The model supports a variety of music styles and mood customization, producing natural, full vocal timbre, smooth transitions, richly layered arrangements...
5mos ago
031.6K
Wenxin (ERNIE) X1.1 - Baidu's Deep Thinking Model with Stronger Understanding

Wenxin (ERNIE) X1.1 is a deep thinking model launched by Baidu, based on a hybrid reinforcement learning framework that focuses on improving language understanding and generation. The model excels at handling complex questions, following instructions, and simulating the behavior of agents, and can provide accurate, knowledgeable answers and high-quality text content.
5mos ago
031.9K
WeKnora - Tencent WeChat's Open-Source Document Understanding and Semantic Retrieval Framework

WeKnora is an LLM-based document understanding and semantic retrieval framework open-sourced by the Tencent WeChat team. Designed for scenarios involving structurally complex, heterogeneous documents, it uses a modular architecture that integrates multimodal preprocessing, semantic vector indexing, intelligent recall, and large-model generative reasoning ...
5mos ago
064.5K
XTuner V1 - Shanghai AI Lab's Open-Source Large Model Training Engine

XTuner V1 is a new-generation large model training engine open-sourced by Shanghai Artificial Intelligence Laboratory, designed for training ultra-large-scale sparse Mixture-of-Experts (MoE) models. Built on PyTorch FSDP, it achieves high performance through multi-dimensional optimization of memory, communication and load ...
5mos ago
028.2K
OneCAT - Multimodal Model Open-Sourced by Meituan and Shanghai Jiao Tong University

OneCAT is a new unified multimodal model launched by Meituan in conjunction with Shanghai Jiao Tong University. It adopts a decoder-only architecture and seamlessly integrates multimodal understanding, text-to-image generation, and image editing. The model abandons the traditional multimodal design that relies on external vision encoders and tokenizers, and through modality-specific...
5mos ago
030.5K
Step-Audio 2 mini - StepFun's Open-Source Speech Large Model

Step-Audio 2 mini is an open-source end-to-end speech large model from StepFun. It breaks away from the traditional speech model structure and adopts a true end-to-end multimodal architecture that directly transforms raw audio input into a speech response with lower latency, and it understands paralinguistic information and non-speech signals.
5mos ago
039.3K
InternVL3.5 - Shanghai AI Lab's Open-Source Multimodal Large Model

InternVL3.5 (Shusheng-Wanxiang 3.5) is an open-source multimodal large model from Shanghai Artificial Intelligence Laboratory. The model is fully upgraded in general capability, reasoning capability, and deployment efficiency, and is available in nine sizes from 1 billion to 241 billion parameters, covering scenarios with different resource requirements, including dense...
5mos ago
039.4K
FastVLM - Vision Language Model from Apple

FastVLM (Fast Vision Language Model) is an efficient vision language model introduced by Apple Inc. Built around the FastViTHD hybrid vision encoder, it combines convolutional and Transformer architectures to significantly reduce visual...
5mos ago
036.6K
Meeseeks - Meituan's Open-Source Benchmark for Evaluating Models' Instruction-Following Ability

Meeseeks is an open-source large model evaluation set from the Meituan M17 team for evaluating a model's ability to follow instructions. Meeseeks uses a three-tiered evaluation framework to comprehensively measure, from the macro to the micro level, whether the model generates answers in strict accordance with the user's instructions, without evaluating the knowledge correctness of the answer content ...
5mos ago
032.9K
gpt-realtime - OpenAI's Newest AI Speech Model

gpt-realtime is an advanced speech model from OpenAI that supports direct audio processing to generate natural and smooth speech. The model supports multiple languages and styles, understands non-verbal cues such as laughter, and can switch between languages.
5mos ago
036.2K
HunyuanVideo-Foley - Tencent's Open-Source Video Sound Effects Generation Model

HunyuanVideo-Foley is an open-source video sound effects generation model from the Tencent Hunyuan team that adds accurately matched sound effects to silent videos. The model is trained on a large-scale dataset and uses a multimodal diffusion transformer architecture, combined with a representation alignment loss function and audio VAE optimization techniques ...
5mos ago
044.1K
Wenxiaobai 5 - All-in-One AI Model from Wenxiaobai

Wenxiaobai 5 is a flagship "All in One" model with a very high level of intelligence. The model performs well on many evaluations, for example scoring 64.7 on the AA-Index composite evaluation and 86 on the STEM ability evaluation, which is close to the world-leading GPT-5.
5mos ago
035.9K