AI open source project

1020 posts in total
MegaTTS3: A Lightweight Model for Synthesizing Chinese and English Speech

Comprehensive Introduction: MegaTTS3 is an open source speech synthesis tool developed by ByteDance in cooperation with Zhejiang University that focuses on generating high-quality Chinese and English speech. Its core model has only 0.45B parameters, making it lightweight and efficient, and it supports mixed Chinese-English speech generation and voice cloning. The project is hosted on ...
5mos ago
Research Rabbit: Web research and report writing with a local LLM that automatically digs into a user-specified topic and generates a summary

General Introduction: Research Rabbit is a web research and summarization assistant based on a local LLM (large language model). After the user provides a research topic, Research Rabbit generates a search query, obtains relevant web results, and summarizes those results...
4mos ago
DisPose: Generating videos with precise control of human pose, for creating videos of dancing women

General Introduction: DisPose is an innovative open source artificial intelligence project focused on controllable character image animation. Developed by a team of researchers and open-sourced on GitHub, the project uses deep learning techniques to achieve precise control of character animation by decomposing skeletal pose information. D...
8mos ago
ColorFlow: Automatic colorization of comics and black-and-white images with improved color consistency and quality

Comprehensive Introduction: ColorFlow is an automatic image-sequence colorization tool developed by Tencent's ARC team to solve the problem of colorizing black-and-white image sequences. The tool uses a retrieval-augmented colorization pipeline to accurately generate the colors of various elements from a pool of reference images, including the character's hair color and service...
8mos ago
dsRAG: A Retrieval Engine for Unstructured Data and Complex Queries

Comprehensive Introduction: dsRAG is a high-performance retrieval engine designed to handle complex queries on unstructured data. It performs particularly well on challenging queries over dense text such as financial reports, legal documents, and academic papers. dsRAG employs three key approaches to improve performance: language...
6mos ago
MOFA Video: Motion Field Adaptation Technology Converts Still Images to Video

General Introduction: MOFA-Video is a state-of-the-art image animation tool that uses generative motion field adaptation to convert static images into dynamic videos. The project was developed jointly by the University of Tokyo and Tencent AI Lab, and will be presented at the 2024 European Conference on Computer Vision (E...
7mos ago
Sim Studio: Open source workflow builder for AI agents

Comprehensive Introduction: Sim Studio is an open source platform for building AI agent workflows, focused on helping users quickly design, test, and deploy large language model (LLM) workflows through a lightweight, intuitive visual interface. Users can create complex workflows without deep programming knowledge by dragging and dropping...
3mos ago
Rankify: A Python toolkit for information retrieval and reranking

General Introduction: Rankify is an open source Python toolkit developed by the Data Science Group at the University of Innsbruck, Austria. It focuses on information retrieval, reranking, and retrieval-augmented generation (RAG), providing a unified framework. The toolkit comes with a built-in set of 40 pre-retrieved benchmarks...
5mos ago
AIEvo: An Efficient Framework for Creating Multi-Agent Collaborative Applications

General Introduction: AIEvo is Ant Group's open source multi-agent framework, designed for efficiently creating multi-agent applications. The framework strictly follows an SOP task graph to improve the execution success rate of complex tasks, and uses feedback and monitoring mechanisms to ensure high flexibility and scalability. AIEvo has been produced within Ant Group ...
7mos ago
Vexa: A real-time meeting transcription and intelligent knowledge extraction tool

Comprehensive Introduction: Vexa is an open source real-time meeting transcription and knowledge management platform designed to provide efficient meeting recording and intelligent knowledge extraction for enterprises and individuals. It automatically joins platforms such as Google Meet and Zoom through API-driven meeting bots...
4mos ago
Swarms: Multi-Agent Orchestration Framework, an Enterprise-Grade Production Tool

General Introduction: Swarms is an enterprise-grade, production-ready multi-agent orchestration framework designed to boost business productivity through efficient agent management and task processing. With support for multiple models, multiple memory systems, and custom agent creation, the framework provides a modular design and comprehensive logging capabilities to ensure that the system...
8mos ago
FinRobot: An AI Agent for More Efficient Financial Data Analysis and Investment Research

Comprehensive Introduction: FinRobot is an open source AI agent platform developed by the AI4Finance Foundation and designed for financial analytics. It not only covers traditional language models, but also incorporates a variety of AI technologies, aiming to provide a comprehensive solution for the financial industry. F...
6mos ago
LocalPdfChatRAG: Intelligent Chat Tool for Question Answering over Local Multi-Source PDF Documents

Comprehensive Introduction: LocalPdfChatRAG is an open source project that aims to implement intelligent chat functionality by combining local PDF documents with Retrieval Augmented Generation (RAG) models. The project allows users to upload PDF documents and ask questions in natural language to get from the document to the relative ...
6mos ago
TRV: Rapidly Generate Presentation Videos from Slides/PPTs and Speaker Notes

General Introduction: TRV is an open source tool, hosted on GitHub, designed to help users quickly convert slides and presentation notes into narrated videos. It automatically generates audio and video content from input presentation files through simple command line operations, making it suitable for those who need to quickly create presentations...
6mos ago
CHRONOS: News Timeline Summarization Tool to Improve News Retrieval and Timeline Generation Efficiency

Comprehensive Introduction: CHRONOS is a news timeline summarization tool developed by the Alibaba NLP team. The tool generates timeline summaries of news events through iterative self-questioning. CHRONOS is not only capable of handling open-domain timeline summarization tasks, but also in terms of efficiency and scalability...
7mos ago
Potpie AI: Quickly create AI engineering assistants tailored to your codebase

Comprehensive Introduction: Potpie AI is an open source platform focused on providing developers with customized AI engineering assistants. By building a knowledge graph of the codebase, it allows AI agents to deeply understand code structure and logic and to automate tasks such as debugging, testing, and code generation. Users can use simple...
4mos ago
LangManus: An open source AI automation framework supporting multi-agent collaboration

General Introduction: LangManus is an open source AI automation framework hosted on GitHub. Developed by a group of former colleagues in their spare time, it is an academically driven project with the goal of combining language models and specialized tools to accomplish web search, data crawling, and code execution...
5mos ago
Agent TARS: An Open Source Agent That Operates Computers Using Vision and Commands

Comprehensive Introduction: Agent TARS is a multimodal AI agent open-sourced by ByteDance. Its core feature is to visually understand web content and combine command line and file system operations to help users complete complex computer tasks. Instead of requiring manual operations like traditional tools, it can self...
5mos ago
Crawl4LLM: An Efficient Web Crawling Tool for LLM Pretraining

Comprehensive Introduction: Crawl4LLM is an open source project jointly developed by Tsinghua University and Carnegie Mellon University, focusing on making web crawling more efficient for pre-training large language models (LLMs). It significantly reduces ineffective crawling by intelligently selecting high-quality web page data, and claims that what originally required crawling 1...
6mos ago
ChatFree (ChatAnywhere-2): A local Copilot built on the GPT API that supports conversation completion in any window

General Introduction: ChatFree is an open source project that aims to free users' AI apps from the constraints of the browser so they can run locally. Built on the GPT API, this Copilot is designed to support a wide range of office software such as Office, Word, WPS, and more. The project was developed by ...
8mos ago
LLManager: A management tool that combines intelligent automated approval workflows with human review

Comprehensive Introduction: LLManager is an open source intelligent approval management tool built on LangChain's LangGraph framework. It focuses on automating the processing of approval requests while optimizing decision making with human review. It does this through semantic search, few-shot learning and...
4mos ago
Search o1: Giving reasoning models active search capability so that large models can search external knowledge while they think

Comprehensive Introduction: Search-o1 is an open source project that aims to enhance the performance of large reasoning models (LRMs) by integrating advanced search mechanisms. The core idea is to address the knowledge gaps encountered during reasoning through dynamic search and knowledge integration. The project was developed by sunn...
7mos ago