Latest AI Resources

Total: 2828 articles
Handy - Free, Open-Source Local AI Speech-to-Text Tool

Handy is a free, open-source local speech-to-text tool for Windows, macOS, and Linux, built with Rust and React. It processes voice data locally without uploading it to the cloud, which protects privacy, and is suited to quick transcription and text input.
3wks ago
013.3K
Petri - Anthropic's Open-Source AI Safety Auditing Framework

Petri is an open-source AI safety auditing framework developed by Anthropic that systematically assesses the safety and behavioral alignment of AI models. It simulates real-world scenarios in which an automated auditor holds multi-turn conversations with a target model, followed by a judge agent that acts on the model's...
3wks ago
011.2K
OmniVinci - NVIDIA's Open-Source Omni-Modal Large Language Model

OmniVinci is an open-source omni-modal large language model developed by NVIDIA that addresses modality fragmentation in multimodal models through architectural innovation and data optimization. Alignment of visual and audio embeddings is enhanced by OmniAlignNet, which utilizes temporally embedded group capture...
4wks ago
015.7K
ValueCell - Open-Source Multi-Agent Financial Platform Where Multiple Agents Divide the Work

ValueCell is an open-source multi-agent financial application platform that uses AI to improve the efficiency of financial analysis and investment management. Modeled on a professional investment team, multiple AI agents work together, covering market analysis, sentiment analysis, fundamental research, automated trading, and other functions, to provide users with a comprehensive...
1mos ago
027.8K
Dexbotic - Dexmal's Open-Source One-Stop Research Platform for Embodied-Intelligence VLA Models

Dexbotic is an open-source, one-stop research service platform from Dexmal for Vision-Language-Action (VLA) models in embodied intelligence, addressing fragmentation and low research efficiency in the field. It is based on PyTorch...
1mos ago
013.3K
LongCat-Video - Open-Source Video Generation Model from Meituan's LongCat Team

LongCat-Video is a 13.6-billion-parameter video generation model open-sourced by Meituan's LongCat team under the MIT License, supporting three major tasks: text-to-video, image-to-video, and video continuation. Through a "coarse-to-fine" generation strategy and a block sparse attention mechanism, the model can, in a number of minutes ...
1mos ago
024.4K
Hunyuan World Model 1.1 - Open-Source Large 3D Reconstruction Model Released by Tencent Hunyuan

Hunyuan World Model 1.1 (WorldMirror) is an open-source large 3D reconstruction model released by the Tencent Hunyuan team and an upgraded version of the Hunyuan World Model series. It supports multi-view images, videos, and multimodal prior inputs such as camera poses, camera intrinsics, and depth maps. It breaks through traditional 3D reconstruction that only relies on...
1mos ago
018.4K
VitaBench - Meituan LongCat's Open-Source Interactive Agent Evaluation Benchmark

VitaBench is the first interactive agent evaluation benchmark for complex life scenarios, released by Meituan's LongCat team, and assesses the overall capabilities of large-model agents in real-life scenarios. Three high-frequency life scenarios (food delivery ordering, in-restaurant dining, and travel) are used as the carrier to build the package...
1mos ago
017K
UniPixel - Pixel-Level Multimodal Model Open-Sourced by Hong Kong Polytechnic University, Tencent, the Chinese Academy of Sciences, and Others

UniPixel is a novel multimodal model jointly proposed by Hong Kong Polytechnic University, Tencent, the Chinese Academy of Sciences, and Vivo to achieve pixel-level vision-language understanding. By unifying object referring and segmentation capabilities, it supports a variety of fine-grained tasks such as image segmentation, video segmentation, region understanding, and pi...
1mos ago
019.3K
DiaMoE-TTS - Multi-Dialect Speech Synthesis Framework Open-Sourced by Tsinghua University and Giant Network

DiaMoE-TTS is a multi-dialect speech synthesis framework jointly open-sourced by Tsinghua University and Giant Network. Based on the International Phonetic Alphabet (IPA), it addresses dialect data scarcity, orthographic inconsistency, and complex phonological changes. A unified IPA front end standardizes phoneme representation to eliminate cross-dialect differences ...
1mos ago
019.6K
SongBloom - Open-Source Song Generation Model from Tencent, CUHK-Shenzhen, and Nanjing University

SongBloom is an open-source song generation model developed by Tencent AI Lab in collaboration with The Chinese University of Hong Kong (Shenzhen) and Nanjing University. It addresses the problem of "plasticity" in AI music generation and produces high-quality, structurally complete songs. Provide 10 seconds of reference audio and the corresponding lyrics, and you can...
1mos ago
017.9K
SAIL-VL2 - ByteDance's Open-Source Multimodal Vision-Language Model

SAIL-VL2 is an open-source multimodal vision-language model from the ByteDance team, focusing on joint modeling of multimodal inputs such as images and text. Using a sparse mixture-of-experts (MoE) architecture and a progressive training strategy, it achieves high performance at parameter scales from 2B to 8B, especially in the areas of image-text comprehension, math...
1mos ago
013.6K
MineContext - ByteDance's Open-Source Proactive Context-Aware AI Partner

MineContext is a proactive context-aware AI partner open-sourced by the ByteDance Viking team to help users manage massive amounts of information and improve the efficiency of knowledge work. Through screenshots and content understanding, it automatically records the user's daily activities (such as browsing the web, editing documents, etc.), support...
1mos ago
022.3K
Agentic AI - Andrew Ng's Latest Free Course on Agents

Agentic AI is the latest course on agents launched by Andrew Ng. The course focuses on the design and construction of agents, covering four major design patterns: reflection, tool use, planning, and multi-agent collaboration. Through theoretical explanations and hands-on coding, learners will master how to make agents check their outputs, adjust autonomously...
2mos ago
021.9K
EchoCare - Open-Source Ultrasound Foundation Model from the Chinese Academy of Sciences' Hong Kong Institute

EchoCare is an ultrasound foundation model developed by the Center for Artificial Intelligence and Robotics (CAIR) at the Hong Kong institute of the Chinese Academy of Sciences (CAS). It is trained on the world's largest ultrasound image dataset (more than 4.5 million images), covering multi-center, multi-region, multi-ethnicity, and more than 50 individuals...
2mos ago
015.6K
RoboBrain-X0 - BAAI's Open-Source Embodied Model with Zero-Shot Cross-Embodiment Generalization

RoboBrain-X0 is the world's first open-source embodied model that supports zero-shot cross-embodiment generalization, released by the Beijing Academy of Artificial Intelligence (BAAI), and is of great industrial significance. It can drive multiple real robots of different configurations to complete basic manipulation tasks without fine-tuning, and after few-shot fine-tuning, it demonstrates the ability to replicate ...
2mos ago
017.6K
CWM - Meta FAIR's Open-Source Code World Model

CWM (Code World Model) is a 32-billion-parameter open-source language model released by the Meta FAIR team, designed for code generation and reasoning. By introducing the concept of a "world model", it can simulate code execution, predict changes in variable state, and advance...
2mos ago
019.7K
Neovate Code - Ant Group's Open-Source Intelligent Programming Assistant

Neovate Code is an open-source intelligent programming assistant from the Alipay Experience Technology Department at Ant Group, which uses AI to improve development efficiency. With its conversational development features, developers can describe requirements in natural language, and Neovate Code can understand them and generate the corresponding...
2mos ago
021.7K