AI open source project

Total: 1020 articles
MiniRAG: A Simplified Retrieval-Augmented Generation Framework That Recalls Relevant Text Chunks via Entity Graph Indexing

MiniRAG is an extremely simple Retrieval-Augmented Generation (RAG) framework that aims to deliver good RAG performance even with small models, using heterogeneous graph indexing and lightweight topology-enhanced retrieval. It is developed by the Data Science Laboratory of the University of Hong Kong (HKUDS) to address ...
9mos ago
025.3K
AutoGPT: An Agent-Building Platform for Workflow Automation and Autonomous Task Execution

AutoGPT is a powerful platform designed to help users create, deploy, and manage continuously running AI agents that automate complex workflows. Developed by Significant Gravitas, the platform offers a wide range of tools and features that enable users to focus...
10mos ago
025.2K
MegaParse: Parses All Types of Documents into LLM-Ready Data, Fully Preserving Tables, Images, and Other Content

MegaParse is a powerful and versatile document parsing tool designed to prepare data for large language models (LLMs). Whether you are working with text, PDF, PowerPoint presentations or Word documents, MegaParse...
10mos ago
025.2K
Genesis: An Open Source Generative Physics Engine for Physically Grounded 4D Dynamic World Simulation

Genesis is a generative physics world designed for general-purpose robotics and embodied AI learning. It provides a unified simulation platform that supports a wide range of materials and physical phenomena. Genesis aims to unlock generative AI and physics simulation by combining...
10mos ago
025.2K
AnkiAIUtils: An AI Toolset for Anki Flashcard Learning, an Intelligent Assistant That Automatically Improves Flashcards

AnkiAIUtils is a set of AI-enhanced tools designed for the Anki flashcard learning system. Developed by a medical student, it uses AI to automatically improve the cards that users struggle with while studying. It can intelligently provide users with personalized...
10mos ago
025.1K
NodeRAG: A Heterogeneous Graph-Based Tool for Accurate Information Retrieval and Generation

NodeRAG is an open source Retrieval-Augmented Generation (RAG) system hosted on GitHub and developed by Terry-Xu-666. It optimizes information retrieval and generation through heterogeneous graph structures, significantly improving retrieval accuracy and contextual relevance. Nod...
6mos ago
025.1K
SegAnyMo: An Open Source Tool for Automatically Segmenting Arbitrary Moving Objects in Video

SegAnyMo is an open source project developed by a team of researchers at UC Berkeley and Peking University, including Nan Huang. The tool focuses on video processing and can automatically recognize and segment arbitrary moving objects in a video, such as people, animals or...
6mos ago
025.1K
Agentic Security: An Open Source LLM Vulnerability Scanner Offering Comprehensive Fuzz Testing and Attack Techniques

Agentic Security is an open source large language model (LLM) vulnerability scanning tool designed to provide developers and security professionals with comprehensive fuzz testing and attack techniques. The tool supports customized rule sets or agent-based attacks and is able to integrate LLM AP...
8mos ago
025.1K
Leffa: High-Fidelity Virtual Try-On and Pose Adjustment, Meta's Open Source Controllable Person Image Generation Model

Leffa is a unified framework for controllable person image generation, enabling precise manipulation of a person's appearance (e.g., virtual try-on) and pose (e.g., pose transfer). The framework significantly reduces distortion of fine-grained details by guiding the target query to attend to the correct reference key in the attention layer, with ...
10mos ago
025.1K
Flow (Laminar): A Lightweight Task Engine for Building Agents That Simplifies Task Management and Keeps It Flexible

Flow is a lightweight task engine designed for building AI agents, emphasizing simplicity and flexibility. Unlike traditional node- and edge-based workflows, Flow uses a dynamic task queuing system that supports parallel execution, dynamic scheduling, and intelligent dependency management. Its core concept is ...
10mos ago
025K
Fay Digital Human Framework: Integrates Language Models with 3D Digital Characters to Support Multiple Application Scenarios

Fay is an open source 3D virtual digital human framework that integrates language models and digital characters for a variety of application scenarios, such as virtual shopping guides, virtual streamers, assistants, waiters, teachers, and voice- or text-based mobile assistants. The Fay framework supports full offline use, providing m...
9mos ago
024.9K
MedRAX: An Agent for Chest X-ray Analysis Using Multimodal Large Models

MedRAX is a state-of-the-art AI agent designed for chest X-ray (CXR) analysis. It integrates state-of-the-art CXR analysis tools and multimodal large language models to dynamically process complex medical queries without additional training. MedRAX, through its modular design...
7mos ago
024.9K
Hunyuan Text-to-Video: Generating High-Quality Videos with a Realistic Cinematic Feel, Tencent's Open Source Large Video Generation Model

Tencent Hunyuan Text-to-Video (available in the Yuanbao app) is an AI-based video generation platform launched by Tencent. The platform uses the Tencent Hunyuan large model, with its strong cross-domain knowledge and natural language understanding, to generate high-quality videos from users' text descriptions...
9mos ago
024.9K
Devika: An Open Source AI Software Engineer Agent That Understands Instructions, Splits Them into Subtasks, and Writes Code

Devika is an advanced AI software engineer that understands high-level human instructions, breaks them down into steps, researches the relevant information, and writes code to achieve a given goal. It intelligently develops software using large language models, planning and reasoning algorithms, and web browsing capabilities. D...
7mos ago
024.9K
Deep Recall: An Open Source Tool That Provides an Enterprise-Grade Memory Framework for Large Models

Deep Recall is an open source, enterprise-grade memory framework designed for large language models (LLMs). It provides hyper-personalized responses through efficient context retrieval and integration. The framework uses a three-tier architecture, including a memory service, a reasoning service, and a coordinator, supporting...
5mos ago
024.8K
AI ContentCraft: A Versatile AI Content Creation Tool for Generating Short Stories, Dialog Scripts, Voiceovers, and Illustrations

AI ContentCraft is a versatile content creation tool that integrates text generation, speech synthesis, image generation and more. It helps creators quickly generate stories, podcast scripts, and accompanying audio and video content. The tool supports conversion between multiple languages and can batch...
9mos ago
024.8K
EchoMimic: Audio-Driven Generation of Talking Videos from Portrait Photos (EchoMimicV2 Accelerated Installer)

EchoMimic is an open source project designed to generate realistic, audio-driven portrait animations. Developed by Ant Group's Terminal Technologies division, the project uses editable facial landmark conditions to generate dynamic portrait videos from a combination of audio and facial landmarks. EchoMimic...
9mos ago
024.8K
Paper2Code: Automatically Converting Machine Learning Papers into Runnable Code

Paper2Code is an open source project that aims to address the lack of code implementations for machine learning papers. It automatically transforms scientific papers into runnable code repositories using PaperCoder, a multi-agent large language model (LLM) system. The system uses planning ...
5mos ago
024.8K
CogAgent: Zhipu's Open Source Visual Language Model for Automating Graphical Interface Operations

CogAgent is an open source visual language model developed by the Tsinghua University Data Mining Research Group (THUDM) that aims to automate the operation of cross-platform graphical user interfaces (GUIs). The model is based on CogVLM (GLM-4V-9B) and supports both Chinese and English...
10mos ago
024.7K
AnimeGamer: An Open Source Tool for Generating Anime Videos and Character Interactions with Language Commands

AnimeGamer is an open source tool launched by Tencent ARC Lab. Users can generate anime videos with simple language commands, such as "Sousuke drives around in a purple car", and can have characters from different anime interact with each other, such as Kiki from Kiki's Delivery Service, and Castle in the Sky...
6mos ago
024.7K
opensource_notebooklm: An Open Source Implementation of NotebookLM Based on Deepseek-V3 and PlayHT TTS

Open Source NotebookLM is an artificial intelligence project that combines Deepseek-V3's language understanding capabilities with PlayHT's speech synthesis technology, aiming to create an intelligent note-taking conversation system. The project was developed by Build Fast w...
9mos ago
024.7K
AnyText: Generates and Edits Multilingual Text in Images, with Highly Controllable Generation of Multi-Line Chinese Text

AnyText is a multilingual visual text generation and editing tool based on a diffusion model. It generates natural, high-quality multilingual text in images and supports flexible text editing. It was developed by a team of researchers and presented at ICLR 2024...
10mos ago
024.7K
Harbor: A Containerized Toolset for One-Click Deployment of Local LLM Development Environments and Easy Management of AI Services

Harbor is a containerized LLM toolset focused on simplifying the deployment and management of local AI development environments. With a clean command-line interface (CLI) and a companion application, it lets developers launch and manage components with a single click, including LLM backends, API interfaces, front...
9mos ago
024.7K
Sana: Fast High-Resolution Image Generation with an Ultra-Small 0.6B Model That Runs on Low-End Laptop GPUs

Sana is an efficient high-resolution image generation framework developed by NVIDIA Labs, capable of generating images at up to 4096 × 4096 resolution in a matter of seconds. Sana uses a linear diffusion transformer and a deep compression autoencoder to significantly...
11mos ago
024.6K
Morphic: An AI-Powered Open Source Search Engine That Offers Smart Q&A, Video Search, and UI Code Generation

Morphic is an AI-based search engine with a generative user interface, designed to provide intelligent Q&A and an efficient search experience. Users can perform a variety of searches with Morphic, including text and video, and can save their search history and share search results. Mo...
11mos ago
024.6K
STORM: Searches Web Data by Topic to Generate Cited Papers and Long-Form Reports

STORM is a knowledge integration and article generation system developed by the OVAL team at Stanford University. It focuses on generating exhaustive Wikipedia-like articles (systematic papers) from scratch. The system uses large language models for topic research, preparing outlines and simulating actual interconnected...
7mos ago
024.6K
LangGraph Supervisor: A Tool for Managing Multi-Agent Collaboration Using a Supervisor Agent

LangGraph Supervisor is a Python library based on the LangGraph framework, designed for creating and managing multi-agent systems. The library coordinates the work of multiple specialized agents through a central supervisor agent, ensuring that communication flows and tasks are divided...
8mos ago
024.6K
RAG Web UI: Build an Intelligent Document Q&A System and Easily Create a Private Web-Based Knowledge Base

RAG Web UI is an intelligent dialog system based on Retrieval-Augmented Generation (RAG) technology. It helps organizations and individuals build intelligent Q&A systems on top of their own knowledge bases. By combining document retrieval with large language models, RAG Web UI provides accurate and reliable...
9mos ago
024.5K