AI open source projects

Total: 1020 articles
AI investment system: an automated A-share investment decision system that uses a multi-agent system to analyze market data

Comprehensive Introduction: A_Share_investment_Agent is an A-share investment decision aid based on a multi-agent system. The system is designed to analyze market data, market sentiment, and fundamental data, and to calculate the intrinsic value of stocks, through multiple collaborating agents to...
12mos ago
048.7K
UltraRAG: A One-Stop RAG System Solution to Simplify Data Construction and Model Fine-Tuning

Comprehensive Introduction: UltraRAG is a RAG (Retrieval-Augmented Generation) system solution jointly proposed by the THUNLP group at Tsinghua University, the NEUIR group at Northeastern University, ModelBest Inc., and the 9#AISoft team. The framework is based on agile deployment and modularized building...
12mos ago
048.6K
MiniRAG: a simplified retrieval-augmented generation framework that uses entity graph indexing to recall relevant text chunks

Comprehensive Introduction: MiniRAG is an extremely simple Retrieval-Augmented Generation (RAG) framework that aims to enable good RAG performance even for small models through heterogeneous graph indexing and lightweight topology-enhanced retrieval. It is developed by the Data Intelligence Laboratory of the University of Hong Kong (HKUDS) to address...
12mos ago
048.5K
InvSR: an open source image super-resolution project for improving image resolution and quality

General Introduction: InvSR is an innovative open source image super-resolution project based on diffusion inversion, capable of converting low-resolution images into high-quality, high-resolution images. The project utilizes the rich image priors embedded in pre-trained large-scale diffusion models to support, through a flexible sampling mechanism, the...
1yr ago
048.4K
Amurex: an open source AI meeting assistant that automatically records meetings and generates summaries

General Introduction: Amurex is an open source AI meeting assistant developed by The Personal AI Company that aims to improve meeting efficiency through intelligent features. Amurex can provide real-time suggestions, generate intelligent summaries, record meeting content, and automatically send follow...
1yr ago
048.4K
sensitive-word: a sensitive word filtering tool with an efficient DFA algorithm implementation

Comprehensive Introduction: Sensitive Word Filtering Tool (Sensitive Word) is a high-performance Java sensitive word filtering tool built on the DFA algorithm. The tool efficiently detects and filters sensitive words and supports a variety of format conversions and custom replacement strategies. Its design goal is to provide...
1yr ago
048.4K
ModelBest (面壁智能): The World's Leading Lightweight, High-Performance On-Device Large Models

General Introduction: ModelBest is a company specializing in developing lightweight, high-performance large models, dedicated to applying advanced AI technologies to mainstream consumer electronics and various edge devices in daily life. Its MiniCPM series of on-device models are characterized by extreme compute and memory usage efficiency...
1yr ago
048.2K
AutoAgent: a framework for rapidly creating and deploying AI agents through natural language

General Introduction: AutoAgent is an open source AI agent framework developed by the Data Intelligence Laboratory of the University of Hong Kong (HKUDS) and hosted on GitHub. It allows users to rapidly create and deploy customized AI agents by describing their requirements in purely natural language, without any programming base...
7mos ago
048.2K
AI2SRT: create narrated short videos or video summaries from long videos with one click using Gemini models

Comprehensive Introduction: AI2SRT is an open source project that uses the Gemini AI large model to generate narrated short videos and video summaries from long videos with one click, while also supporting audio and video transcription into subtitles. The project aims to simplify the video content creation process and provide efficient subtitle generation and translation functions. Users can pass...
1yr ago
048.2K
MMAudio: a video-to-audio multimodal joint training tool that generates synchronized sound effects and soundtracks for video footage

General Introduction: MMAudio is an open source project aiming to generate high-quality synchronized audio through joint multimodal training. Developed by Ho Kei Cheng et al. at the Chinese University of Hong Kong, the project's main function is to generate synchronized audio based on video and/or text input. MM...
1yr ago
048.2K
NodeRAG: A Heterogeneous Graph-Based Tool for Accurate Information Retrieval and Generation

Comprehensive Introduction: NodeRAG is an open source Retrieval-Augmented Generation (RAG) system hosted on GitHub and developed by Terry-Xu-666. It optimizes information retrieval and generation through heterogeneous graph structures, significantly improving retrieval accuracy and contextual relevance. Nod...
9mos ago
048.2K
Agent TARS: an open source agent that operates computers using vision and commands

Comprehensive Introduction: Agent TARS is a multimodal AI agent open-sourced by ByteDance. Its core feature is to visually understand web content and combine this with command-line and file system operations to help users complete complex computer tasks. Instead of requiring manual operations like traditional tools, it can self...
10mos ago
048.1K
MedRAX: an agent for chest X-ray analysis using multimodal large models

Comprehensive Introduction: MedRAX is a state-of-the-art AI agent designed for chest X-ray (CXR) analysis. It integrates state-of-the-art CXR analysis tools and multimodal large language models to dynamically process complex medical queries without additional training. MedRAX, through its modular design...
10mos ago
047.9K
SegAnyMo: an open source tool that automatically segments arbitrary moving objects in video

General Introduction: SegAnyMo is an open source project developed by a team of researchers at UC Berkeley and Peking University, including members such as Nan Huang. This tool focuses on video processing and can automatically recognize and segment arbitrary moving objects in a video, such as people, animals or...
10mos ago
047.6K
Ant Design X: a toolkit for rapidly building AI chat interfaces with support for model integration and data flow management

Comprehensive Introduction: Ant Design X is a toolkit open-sourced by Ant Group, designed to help developers quickly build AI-driven conversational interfaces. It provides a rich set of components and templates, supports integration with models compatible with OpenAI standards, and is suitable for a variety of applications such as intelligent customer service, AI assistants, and other...
1yr ago
047.5K
MegaParse: parses all types of documents into LLM-ready data, fully preserving tables, images, and all other information in the document

Comprehensive Introduction: MegaParse is a powerful and versatile document parsing tool designed to optimize data processing for large language models (LLMs). Whether you are working with text, PDF, PowerPoint presentations or Word documents, MegaParse...
1yr ago
047.5K
Hibiki: a real-time speech translation model with streaming translation that preserves the characteristics of the original voice

General Introduction: Hibiki is a high-fidelity real-time speech translation model developed by Kyutai Labs. Unlike traditional offline translation, Hibiki can generate natural speech translation in the target language and provide text translation in real time while the user is speaking. The model...
11mos ago
047.4K
opensource_notebooklm: an open source implementation of NotebookLM based on Deepseek-V3 and PlayHT TTS

General Introduction: Open Source NotebookLM is an innovative artificial intelligence project that combines Deepseek-V3's language understanding capabilities with PlayHT's speech synthesis technology, aiming to create an intelligent conversational note-taking system. The project was developed by Build Fast w...
1yr ago
047.3K
OpenAOE: a large model group chat framework for chatting with multiple large language models simultaneously

Comprehensive Introduction: OpenAOE is an open source large model group chat framework that aims to address the lack of chat frameworks on the market in which multiple models respond in parallel. With OpenAOE, users can talk to multiple large language models (LLMs) at the same time and get parallel output. The framework supports...
11mos ago
047.2K
Flow (Laminar): a lightweight task engine for building agents that simplifies and flexibly manages tasks

Comprehensive Introduction: Flow is a lightweight task engine designed for building AI agents, emphasizing simplicity and flexibility. Unlike traditional node- and edge-based workflows, Flow uses a dynamic task queuing system that supports parallel execution, dynamic scheduling, and intelligent dependency management. Its core concept is...
1yr ago
047.2K
AppAgent: automated smartphone operation using multimodal agents

Comprehensive Introduction: AppAgent is a large language model (LLM)-based multimodal agent framework designed to operate smartphone applications. The framework mimics human interactions such as taps and swipes through a simplified action space, thus eliminating the need for system back-end access and extending its use across different app...
1yr ago
047K
PaddlePaddle PP-TableMagic: a powerful tool for extracting structured information from complex tables

The goal of table recognition is to parse tables in images, accurately identify table structures and cell locations, and restore them to structured table formats (e.g., HTML). In today's information age, a large amount of important tabular data still exists in an unstructured state (e.g., scanned documents with pictures of statistical tables...).
10mos ago
046.9K
Sana: fast high-resolution image generation with an ultra-small 0.6B model that runs on low-end laptop GPUs

General Introduction: Sana is an efficient high-resolution image generation framework developed by NVIDIA Labs, capable of generating images at up to 4096 × 4096 resolution in a matter of seconds. Sana utilizes a linear diffusion transformer and deep compression autoencoder technology to significantly...
1yr ago
046.8K
Easegen: an open source digital human course production platform that generates cloned digital human lecture videos from PPT files with one click

Comprehensive Introduction: Easegen is an open source digital human course creation platform that aims to improve the efficiency of teaching content production and management through AI technology. The platform provides a one-stop solution covering course production, video management, and intelligent questioning, which allows users to create digital human-explained video courses...
1yr ago
046.6K
CogAgent: Zhipu's open source intelligent visual language model for automating graphical interfaces

Comprehensive Introduction: CogAgent is an open source visual language model developed by the Tsinghua University Data Mining Research Group (THUDM), aiming to automate the operation of cross-platform graphical user interfaces (GUIs). The model is based on CogVLM (GLM-4V-9B) and supports bilingual Chinese and English...
1yr ago
046.5K
Paper2Code: Automatically Converting Machine Learning Papers into Runnable Code

General Introduction: Paper2Code is an open source project that aims to address the lack of code implementations for machine learning papers. It automatically transforms scientific papers into runnable code repositories through PaperCoder, a multi-agent large language model (LLM) system. The system uses planning...
8mos ago
046.5K
SmartRead: automatically annotates technical PDF documents and provides relevant citation sources

Comprehensive Introduction: SmartRead is an AI-based open source tool designed for technical documents. It can automatically analyze PDF files and mark key content, such as important terms, titles, or core ideas, to help users quickly understand complex documents. At the same time, it can also provide with the main document...
10mos ago
046.4K
ExtractThinker: extracts and classifies documents into structured data to streamline document processing

Comprehensive Introduction: ExtractThinker is a flexible document intelligence tool that uses large language models (LLMs) to extract and classify structured data from documents, providing a seamless ORM-like document processing workflow. It supports a variety of document loaders, including Tess...
1yr ago
046.4K
BuffGPT: A Low-Code Development Platform for Enterprise-Grade Generative AI Applications

Comprehensive Introduction: BuffGPT is an open source AI application development platform based on large language models (LLMs), providing out-of-the-box features such as data processing, model invocation, RAG retrieval, and visual workflow orchestration to help users easily build and operate generative AI applications. The platform supports privatization...
10mos ago
046.3K
Memora: building human-like AI memory modules that save and update information about interactions with humans

General Introduction: Memora is an agent designed to replicate human memory for each personalized AI. It helps AIs remember details of past interactions, emotions, and shared experiences, just as humans do, through features like timestamped memories, emotion markers, and multimodal memories. Memora supports multi-tenancy and is capable of handling...
1yr ago
046K