AI open source project

1,020 articles in total
Deep Recall: an open source tool that provides an enterprise-grade memory framework for large models

Deep Recall is an open source, enterprise-grade memory framework designed for large language models (LLMs). It delivers hyper-personalized responses through efficient context retrieval and integration. The framework uses a three-tier architecture consisting of a memory service, a reasoning service, and a coordinator, supporting...
1 year ago
SmartRead: Automatically annotate technical PDF documents and provide relevant citation sources

SmartRead is an AI-based open source tool designed for technical documents. It automatically analyzes PDF files and marks key content, such as important terms, headings, or core ideas, to help users quickly understand complex documents. At the same time, it can also provide with the main document...
1 year ago
MindSearch: open source AI search engine framework to deploy your own Perplexity-style search engine

MindSearch is an open source AI search engine framework from Shanghai Artificial Intelligence Laboratory that aims to mimic the human thought process for complex information gathering and integration. The tool combines large language models (LLMs) and search engines through multi-agent...
1 year ago
AnimeGamer: An Open Source Tool for Generating Anime Videos and Character Interactions with Language Commands

AnimeGamer is an open source tool released by Tencent ARC Lab. Users can generate anime videos with simple language commands, such as "Sousuke drives around in a purple car", and can have characters from different anime interact with each other, such as Kiki from Kiki's Delivery Service and Castle in the Sky...
1 year ago
Paper2Code: Automatically Converting Machine Learning Papers into Runnable Code

Paper2Code is an open source project that addresses the lack of code implementations for machine learning papers. It automatically transforms scientific papers into runnable code repositories using PaperCoder, a multi-agent large language model (LLM) system. The system uses planning...
1 year ago
XRAG: A Visual Evaluation Tool for Optimizing Retrieval-Augmented Generation Systems

XRAG (eXamining the Core) is a benchmarking framework for evaluating the foundational components of advanced retrieval-augmented generation (RAG) systems. By profiling and analyzing each core module, XRAG shows how different configurations and components affect RAG...
1 year ago
MegaParse: parses all types of documents into LLM-ready data, fully preserving tables, images, and all other information in the document

MegaParse is a powerful and versatile document parsing tool designed to optimize data processing for large language models (LLMs). Whether you are working with text, PDF, PowerPoint presentations or Word documents, MegaParse...
2 years ago
opensource_notebooklm: open source implementation of NotebookLM based on Deepseek-V3 and PlayHT TTS

Open Source NotebookLM is an artificial intelligence project that combines Deepseek-V3's language understanding capabilities with PlayHT's speech synthesis technology, aiming to create an intelligent conversational note-taking system. The project was developed by Build Fast w...
1 year ago
RLAMA: A Command-Line RAG System for Intelligent Question Answering over Local Documents

RLAMA is an open source RAG (retrieval-augmented generation) system for intelligent document question answering, developed by DonTizi and hosted on GitHub. Its core feature is that all functionality is driven from the command line. Users can use simple terminal commands to connect to local ...
1 year ago
Sketch-Gen: Generate high-quality line drawings and sketches, reverse-engineer image prompts, one-click installer package

Sketch-Gen is an AI-based line drawing and sketch generation tool designed to help artists and designers quickly produce high-quality line drawings and sketches. The tool is derived from the Paints-UNDO project and uses advanced machine learning models that can...
2 years ago
Devika: open source AI software engineer agent that understands instructions, splits them into subtasks, and writes code

Devika is an advanced AI software engineer that understands high-level human instructions, breaks them down into steps, researches the relevant information, and writes code to achieve a given goal. It develops software using large language models, planning and reasoning algorithms, and web browsing capabilities. D...
1 year ago
sensitive-word: sensitive word filtering tool with an efficient DFA algorithm implementation

sensitive-word is a high-performance Java sensitive word filtering tool built on the DFA algorithm. It efficiently detects and filters sensitive words, and supports a variety of format conversions and custom replacement strategies. Its design goal is to provide ...
2 years ago
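As a rough illustration of the DFA idea mentioned in the sensitive-word entry, here is a minimal sketch in Python. It is not the sensitive-word library's Java API. The names `build_dfa`, `find_matches`, and `mask` are hypothetical, and the sketch assumes exact, case-sensitive matching with a repeated-character replacement strategy. Each nested dictionary is a state, each character is a transition, and a marker key flags accepting states.

```python
from typing import Dict, List, Tuple

END = "__end__"  # marker key for an accepting state (never collides with a single character)


def build_dfa(words: List[str]) -> Dict:
    """Build a nested-dict state machine from the word list (hypothetical helper)."""
    root: Dict = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})  # follow or create the transition
        node[END] = True  # mark the end of a complete word
    return root


def find_matches(text: str, dfa: Dict) -> List[Tuple[int, int]]:
    """Return (start, end) spans, taking the longest match at each position."""
    spans: List[Tuple[int, int]] = []
    i = 0
    while i < len(text):
        node = dfa
        longest = -1
        j = i
        # Walk the transitions as far as the text allows.
        while j < len(text) and text[j] in node:
            node = node[text[j]]
            j += 1
            if END in node:
                longest = j  # remember the longest accepting position so far
        if longest != -1:
            spans.append((i, longest))
            i = longest  # resume scanning after the matched word
        else:
            i += 1
    return spans


def mask(text: str, dfa: Dict, replacement: str = "*") -> str:
    """Replace each matched character with `replacement` (one simple strategy)."""
    out: List[str] = []
    pos = 0
    for start, end in find_matches(text, dfa):
        out.append(text[pos:start])
        out.append(replacement * (end - start))
        pos = end
    out.append(text[pos:])
    return "".join(out)


dfa = build_dfa(["bad", "badge", "spam"])
masked = mask("a bad badge and spam", dfa)
```

Because every character is consumed by at most one dictionary lookup per scan step, the cost of a lookup does not grow with the size of the word list, which is the usual reason for choosing this structure over testing each word in turn.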
Leffa: High-fidelity virtual try-on and pose adjustment, Meta's open source controllable person image generation model

Leffa is a unified framework for controllable person image generation, enabling precise manipulation of appearance (e.g., virtual try-on) and pose (e.g., pose transfer). The framework significantly reduces distortion of fine-grained details by directing the target query to focus on the correct reference key in the attention layer, with ...
1 year ago
Agentic Security: open source LLM vulnerability scanning tool that provides comprehensive fuzz testing and attack techniques

Agentic Security is an open source LLM (large language model) vulnerability scanning tool designed to provide developers and security professionals with comprehensive fuzz testing and attack techniques. The tool supports customized rule sets or agent-based attacks and is able to integrate LLM AP...
1 year ago
Maxun: open source no-code platform that automatically scrapes web data and converts it to APIs or spreadsheets

Maxun is an open source no-code web data extraction platform that lets users train robots in minutes to automatically scrape web data and convert it into APIs or spreadsheets. The platform supports pagination and scrolling, can adapt to changes in website layout, provides powerful data crawling...
1 year ago
Aggregator: one-stop proxy crawling and aggregation platform, free proxy pool (please use in compliance)

Aggregator is an open source project that aims to build a free proxy pool by crawling a variety of available proxy nodes. The platform has a flexible plug-in system, so users can implement specific functions through plug-ins according to the particular needs of the target site. The project is mainly used to learn to crawl ...
2 years ago
Story-Adapter: generating continuous, stylistically consistent illustrations from a long story

Story-Adapter is a story visualization framework that converts textual stories into coherent image sequences. Developed by researchers, the project uses a training-free, iterative approach to generate high-quality story illustrations. The framework is characterized by its ability to handle long...
1 year ago
Memora: building humanized AI memory modules to save and update information about interactions with humans

Memora is an agent designed to replicate human memory for each personalized AI. It helps AIs remember details of past interactions, emotions, and shared experiences as humans do, through features such as timestamped memories, emotion markers, and multimodal memories. Memora supports multi-tenancy and is capable of handling...
1 year ago
AppAgent: automated smartphone operation using multimodal agents

AppAgent is a large language model (LLM)-based multimodal agent framework designed to operate smartphone applications. The framework mimics human interactions such as taps and swipes through a simplified action space, thus eliminating the need for system back-end access and extending its use across different app...
1 year ago
VideoSeal: Advanced open source tool for embedding and extracting invisible video watermarks to protect video copyrights

VideoSeal is an open source video watermarking tool developed by Facebook Research, designed to provide efficient video watermark embedding and extraction. The tool supports the latest open source models and contains pre-trained models, training code, inference code and evaluation tools...
1 year ago
EchoMimic: Audio-driven generation of talking videos from portrait photos (EchoMimicV2 accelerated installer)

EchoMimic is an open source project designed to generate realistic audio-driven portrait animations. Developed by Ant Group's Terminal Technologies division, the project uses editable landmark conditions to generate dynamic portrait videos from a combination of audio and facial landmarks. EchoMimic...
1 year ago
Search o1: Giving reasoning models active search, so large models can look up external knowledge while they think

Search-o1 is an open source project that aims to enhance the performance of large reasoning models (LRMs) by integrating advanced search mechanisms. The core idea is to address the knowledge gaps encountered during reasoning through dynamic search and knowledge integration. The project was developed by sunn...
1 year ago
ExtractThinker: extracting and classifying documents into structured data to optimize the document processing flow

ExtractThinker is a flexible document intelligence tool that uses large language models (LLMs) to extract and classify structured data from documents, providing a seamless ORM-like document processing workflow. It supports a variety of document loaders, including Tess...
1 year ago
VideoChat: real-time voice-interactive digital human with custom appearance and voice cloning, supporting end-to-end and cascaded voice pipelines

VideoChat is a real-time voice-interactive digital human project based on open source technology, supporting both end-to-end voice pipelines (GLM-4-Voice - THG) and cascaded pipelines (ASR-LLM-TTS-THG). The project allows users to customize the digital ...
2 years ago
Agenta: a tool for evaluating prompts and model performance, integrated into AI applications

Agenta is an open source AI model management tool that helps users experiment with prompts, test model performance, and monitor runs. It is suited to people who want to develop AI applications quickly, and provides a platform that is simple to operate. You can use it to try the effect of different prompts on...
1 year ago