Latest AI Resources

Total: 2872 articles
Wan-Move - An open-source AI video generation framework from Alibaba Tongyi, Tsinghua University, and others

Wan-Move is an open-source AI video generation framework jointly developed by Alibaba's Tongyi Lab, Tsinghua University, and other organizations. It focuses on high-quality video synthesis through precise motion control. Its core technique, "latent trajectory guidance", seamlessly adds point-level motion control to existing image-to-video models...
1 day ago
03.7K
Kaleido - A multi-subject reference video generation model open-sourced by Zhipu AI, Tsinghua University, and others

Kaleido is an open-source multi-subject reference video generation model jointly developed by Hefei University of Technology, Tsinghua University, and Zhipu AI. It generates subject-consistent videos from multiple reference images, addressing the weaknesses of existing models in multi-subject consistency and background decoupling. Kaleido generates videos through specialized data...
2 days ago
04.8K
Paper2Slides - An open-source AI tool from the University of Hong Kong that turns academic papers into slides

Paper2Slides is an open-source AI tool from the Data Intelligence Laboratory of the University of Hong Kong that converts academic papers into professional slides or posters in one click. It uses RAG (retrieval-augmented generation) to parse the document content directly rather than relying on information from the web, so that the generated slides stay highly consistent with the original...
2 days ago
04.8K
VoxCPM 1.5 - An open-source end-to-end text-to-speech model from ModelBest

VoxCPM 1.5 is an open-source speech generation model released by ModelBest. It is a tokenizer-free text-to-speech (TTS) model with several innovations and improvements. Using an end-to-end diffusion autoregressive architecture, it generates continuous speech waveforms directly from text, avoiding the limitations of traditional tokenization methods...
7 days ago
09.2K
OpenAutoGLM - An open-source phone AI agent model from Zhipu AI

OpenAutoGLM is an open-source agent model with "phone use" capability: it understands what is on the phone screen through multimodal perception and automatically generates the sequence of operations needed to complete a user-specified task. Users only need to describe what they want in natural language, such as "open Meituan to search for nearby hot pot ...
1 week ago
013.5K
InkSight - Google's open-source AI handwriting recognition tool

InkSight is Google's open-source AI handwriting recognition tool that converts handwritten notes on paper into editable digital ink files (e.g., SVG format). Unlike traditional OCR, it not only recognizes the text content but also preserves the handwriting style, paragraph structure, and emphasis marks, and it supports multiple languages.
1 week ago
06.8K
RoboCOIN - A real-world dual-arm robot dataset open-sourced by BAAI and several universities

RoboCOIN is a large-scale real-world dual-arm robot dataset, described as the world's first of its kind, open-sourced by the Beijing Academy of Artificial Intelligence (BAAI) together with a number of companies and universities. It covers 15 robot platforms, 180,000 real manipulation trajectories, and 421 task scenarios. Its most distinctive feature is a hierarchical annotation system that breaks the task down ...
2 weeks ago
08.3K
MemMachine - An open-source AI memory system from MemVerge

MemMachine is an open-source AI memory system developed by MemVerge for AI models and agents. It stores and recalls interaction data much as the human brain does, addressing the "stateless amnesia" of AI. It uses a layered architecture (short-term memory, long-term memory, user profile...
2 weeks ago
012.1K
Vidi2 - ByteDance's open-source large multimodal model for video understanding and generation

Vidi2 is a second-generation large multimodal model for video understanding and generation, open-sourced by ByteDance and focused on understanding, analyzing, and creating video content. It accepts joint text, video, and audio input and can simultaneously interpret visual content, audio information, and natural language instructions to achieve cross-modal interaction and...
2 weeks ago
010.2K
ViMax - An open-source multi-agent video generation framework from the University of Hong Kong

ViMax is an open-source multi-agent video generation framework from the Data Science Laboratory of the University of Hong Kong that automates the whole process from creative input to video output. It integrates script generation, scene design, shot planning, and video rendering, letting users generate coherent, film-grade video from natural language descriptions ...
3 weeks ago
019.3K
HunyuanOCR - Tencent Hunyuan's open-source expert model for optical character recognition

HunyuanOCR is a high-performance optical character recognition model open-sourced by the Tencent Hunyuan team, with only 1 billion parameters. Built on the Hunyuan multimodal architecture, it adopts an end-to-end design and efficiently handles text detection, recognition, and document parsing. The model scored 94.1 points on a complex-document test, surpassing...
3 weeks ago
015.7K
Awex - Ant Group's open-source high-performance weight exchange framework

Awex is a high-performance weight exchange framework open-sourced by Ant Group, designed for large-scale parameter synchronization in reinforcement learning. It can exchange terabytes of parameters in seconds, significantly improving training and inference efficiency. Awex synchronizes extremely fast: on a thousand-GPU cluster, a trillion-parameter model can complete, within 6 seconds, a full...
4 weeks ago
015.5K
LoopTool - An automated tool-calling data evolution framework open-sourced by Shanghai Jiao Tong University and Xiaohongshu

LoopTool is an automated tool-calling data evolution framework open-sourced by Shanghai Jiao Tong University and the Xiaohongshu team, designed to improve the tool-calling capability of large language models. It optimizes data generation and model training through closed-loop iteration, using open-source models (e.g., Qwen3-32B) as data generation...
4 weeks ago
014.5K
ChatTutor - An open-source AI teaching assistant for visual, interactive learning

ChatTutor is an open-source AI teaching assistant focused on visual, interactive learning in STEM subjects. Its multi-agent architecture provides conversational Q&A and dynamic drawing: it can draw math graphs, physics circuits, or mind maps on a whiteboard in real time, helping users intuitively understand abstract concepts ...
4 weeks ago
010K
EverMemOS - An open-source long-term memory operating system from the Shanda team

EverMemOS is an open-source long-term memory operating system launched by the Shanda team led by Chen Tianqiao. It is designed for AI agents and addresses the memory discontinuity caused by the fixed context window of large language models. The system is modeled on the memory mechanisms of the human brain and uses a four-layer architecture (agent layer, memory layer, index layer...
1 month ago
013.7K
Kosong - Moonshot AI's new open-source AI agent development framework

Kosong is a new AI agent development framework open-sourced by Moonshot AI. It gives developers a lightweight, flexible, and highly scalable foundation for building next-generation agent applications. With an asynchronous tool orchestration engine that efficiently schedules multiple tools...
1 month ago
014.7K
SenseNova-SI - A family of open-source spatial intelligence large models from SenseTime

SenseNova-SI is a family of open-source spatial intelligence large models released by SenseTime, focused on improving AI's spatial understanding and reasoning. The models excel in six core dimensions: spatial measurement, reconstruction, relationship judgment, perspective transformation, deformation analysis, and spatial reasoning, significantly outperforming other...
1 month ago
012.5K
NocoBase - A free, open-source AI no-code development platform for building apps visually

NocoBase is an AI-driven, open-source no-code development platform for quickly building business systems: applications are developed through configuration, with no programming required. The project uses the Apache-2.0 license, offers private deployment and flexible extensibility, and suits enterprise management, collaboration platforms, and similar fields ...
1 month ago
010.6K
UniWorld V2 - A new-generation image editing model from Rabbitpre AI and Peking University

UniWorld V2 is a new-generation image editing model jointly launched by Rabbitpre AI and the UniWorld team at Peking University. It has significant advantages in image editing, especially in understanding Chinese and carrying out complex instructions. The model can accurately render artistic Chinese fonts and support fine...
1 month ago
014.6K
Handy - A free, open-source local AI speech-to-text tool

Handy is a free, open-source local speech-to-text tool for Windows, macOS, and Linux, built with Rust and React. It processes voice data locally without uploading it to the cloud, which protects privacy and security and makes it well suited to quick transcription and text input.
1 month ago
018.6K
Petri - Anthropic's open-source AI safety auditing framework

Petri is an open-source AI safety auditing framework developed by Anthropic that systematically assesses the safety and behavioral alignment of AI models. It simulates realistic scenarios in which an automated auditor holds multi-turn conversations with a target model, followed by a judge agent that acts on the model's...
1 month ago
014K
OmniVinci - NVIDIA's open-source omni-modal large language model

OmniVinci is an open-source omni-modal large language model developed by NVIDIA that addresses modality fragmentation in multimodal models through architectural innovation and data optimization. OmniAlignNet strengthens the alignment of visual and audio embeddings, while Temporal Embedding Grouping captures...
2 months ago
018.2K
ValueCell - An open-source multi-agent financial platform in which multiple agents divide the work

ValueCell is an open-source multi-agent financial application platform that uses AI to improve the efficiency of financial analysis and investment management. Modeled on a professional investment team, its AI agents work together on market analysis, sentiment analysis, fundamental research, automated trading, and other functions, to provide users with a comprehensive...
2 months ago
037K
Dexbotic - Dexmal's open-source one-stop research platform for embodied-intelligence VLA models

Dexbotic is Dexmal's open-source, one-stop research platform for Vision-Language-Action (VLA) models in embodied intelligence. Built on PyTorch, it addresses the fragmentation and inefficiency of research in the field of embodied intelligence...
2 months ago
016.2K
LongCat-Video - An open-source video generation model from Meituan's LongCat team

LongCat-Video is a 13.6-billion-parameter video generation model open-sourced by Meituan's LongCat team under the MIT license. It supports three main tasks: text-to-video, image-to-video, and video continuation. Through a "coarse-to-fine" generation strategy and a block sparse attention mechanism, the model can, in a number of minutes ...
2 months ago
031.8K