AI open source project

Total 1020 articles posts
Fay数字人框架:集成语言模型与3D数字角色,支持多种应用场景

Fay Digital Human Framework: Integrated language modeling and 3D digital characters to support multiple application scenarios

Comprehensive Introduction Fay is an open source 3D virtual digital human framework that integrates language models and digital characters for a variety of application scenarios, such as virtual shopping guides, virtual anchors, assistants, waiters, teachers, and voice- or text-based mobile assistants.The Fay framework supports full offline use, providing m...
7mos ago
03.1K
MOFA Video:运动场适配技术将静态图像转换为视频

MOFA Video: Motion Field Adaptation Technology Converts Still Images to Video

General Introduction MOFA-Video is a state-of-the-art image animation generation tool that utilizes generative motion field adaptation techniques to convert static images into dynamic videos. The project was developed in collaboration with the University of Tokyo and Tencent AI Lab, and will be presented at the 2024 European Conference on Computer Vision (E...
7mos ago
02.6K
Amurex:开源AI会议记录助手,自动记录会议内容生成总结

Amurex: open source AI meeting recording assistant, automatic recording of meeting content to generate summaries

General Introduction Amurex is an open source AI meeting assistant developed by The Personal AI Company that aims to improve meeting efficiency through intelligent features.Amurex can provide real-time suggestions, generate intelligent summaries, record meeting content, and automatically send follow...
7mos ago
03.4K
Agent Laboratory:为研究人员提供自动化代码及研究报告撰写助手

Agent Laboratory: automated code and study writing assistant for researchers

Comprehensive Introduction Agent Laboratory is an end-to-end autonomous research workflow designed to help researchers realize their research ideas. The system consists of dedicated agents driven by large language models that support the entire research workflow - from conducting literature reviews and developing plans to executing...
4mos ago
02.8K
AI投资系统:自动化A股投资决策系统,利用多智能体系统分析市场数据

AI investment system: automated A-share investment decision-making system that utilizes a multi-intelligence system to analyze market data

Comprehensive Introduction A_Share_investment_Agent is an A-share investment decision aid based on a multi-intelligence system. The system is designed to analyze market data, calculate the intrinsic value of stocks, analyze market sentiment, and fundamental data through multiple collaborative intelligences to...
7mos ago
02.6K
Company Researcher:公司研究工具,输入公司网址以获取详细研究信息

Company Researcher: A company research tool, enter a company's web address for detailed research information.

General Description Company Researcher (Company Researcher) is a free and open source tool designed to help users get a quick and comprehensive overview of any company. Simply enter the company's URL and the tool will gather comprehensive information from the web, presenting information about the organization, its products...
4mos ago
02.2K
小智 AI 聊天机器人:打造你的AI聊天伴侣,轻松实现语音对话和智能互动

Xiaozhi AI Chatbot: Build your AI chatting companion, easily realize voice conversation and intelligent interaction

Comprehensive Introduction Xiaozhi AI Chatbot is an open source project based on the ESP32 development board, designed to help users build their own AI chat companion. The project was developed by Shrimp and is mainly used for teaching purposes to help more people get started with AI hardware development and to understand how to apply large language models to real...
5mos ago
03.7K
WrenAI:对话式数据分析AI助手,直接获取答案、SQL查询与分析报表

WrenAI: Conversational Data Analytics AI Assistant with Direct Access to Answers, SQL Queries & Analytics Reports

General Introduction WrenAI is an open source SQL AI assistant specifically designed to help data teams, product teams and business teams gain data insights through natural language conversations. It is capable of converting natural language into SQL queries, generating charts, spreadsheets and reports, supporting multilingual...
7mos ago
03.4K
VITA:开源视觉与语音实时交互的多模态大语言模型

VITA: Open Source Multimodal Large Language Model for Real-Time Interaction between Vision and Speech

General Introduction VITA is a leading open source interactive multimodal large language modeling project, pioneering the ability to achieve true full multimodal interaction. The project launched VITA-1.0 in August 2024, pioneering the first open source interactive fully-modal large language model.2024...
7mos ago
03.1K
TransRouter:基于Gemini多模态模型,实时中英互译的音频转换工具

TransRouter: A Real-Time Audio Conversion Tool for Chinese-to-English Translation Based on Gemini Multimodal Modeling

TransRouter is a real-time voice translation tool based on Google's Gemini model, specifically designed for real-time voice translation between English and Chinese. The tool can be seamlessly integrated into video conferencing software such as Zoom, providing an easy way for cross-language...
7mos ago
03.1K
opensource_notebooklm:基于Deepseek-V3和PlayHT TTS的NotebookLM开源实现

opensource_notebooklm: open source implementation of NotebookLM based on Deepseek-V3 and PlayHT TTS

General Introduction Open Source NotebookLM is an innovative artificial intelligence project that combines Deepseek-V3's language understanding capabilities with PlayHT's speech synthesis technology, aiming to create an intelligent note-taking conversation system. The project was developed by Build Fast w...
7mos ago
02.7K
Vision is All You Need:使用视觉语言模型构建智能文档检索系统(Vision RAG)

Vision is All You Need: Building an Intelligent Document Retrieval System Using Visual Language Models (Vision RAG)

Comprehensive Introduction Vision-is-all-you-need is an innovative visual RAG (Retrieval Augmented Generation) system demonstration project that breaks new ground in applying Visual Language Modeling (VLM) to the document processing domain. Unlike traditional text chunking methods, the system directly makes...
7mos ago
03.2K
Diffbot GraphRAG LLM:依赖外部实时知识图谱数据的LLM推理服务

Diffbot GraphRAG LLM: LLM reasoning service relying on external real-time knowledge graph data

Comprehensive Introduction Diffbot LLM Reasoning Server is an innovative large-scale language modeling system with special optimizations and improvements based on the LLama model architecture. The most important feature of the project is the integration of real-time Knowledge Graph with retrieval-enhanced generation...
7mos ago
02.9K
MetaGPT:多智能体协作框架,构建 AI 软件开发团队实现自然语言编程

MetaGPT: A Multi-Intelligence Collaboration Framework for Building AI Software Development Teams for Natural Language Programming

Comprehensive Introduction MetaGPT is an innovative multi-intelligence body framework designed to model the operations of a complete AI software company. Created by geekan (Alexander Wu), the goal of the project is to combine GPT models with different roles into a collaborative entity...
5mos ago
03.5K
Fish Agent:端到端AI语音克隆助手,实时语音对话助理,Fish Speech衍生项目

Fish Agent: end-to-end AI voice cloning assistant, real-time voice conversation assistant, Fish Speech spin-off project

Comprehensive Introduction Fish Speech Derivative Project Fish Agent is a revolutionary end-to-end AI speech cloning system developed based on the V0.1 3B model architecture. As a fully end-to-end speech clone processing system, its most important feature is the use of innovative speechless...
7mos ago
03.2K
FunClip:智能剪辑视频内容为短片,轻松实现精准视频片段提取/裁剪

FunClip: Intelligent editing of video content into short clips, easy to realize accurate video clip extraction/cropping

Comprehensive Introduction FunClip is a fully open source localized automatic video editing tool developed by TONGYI Speech Lab of Alibaba Dharma Institute. The tool integrates the industrial-grade Paraformer-Large speech recognition model, which can accurately recognize the speech in the video...
7mos ago
03.7K
Dify-WebUI:基于Dify API的桌面智能对话客户端,提供企业级AI对话能力

Dify-WebUI: Desktop Intelligent Conversation Client based on Dify API, providing enterprise-grade AI conversation capabilities

Comprehensive Introduction Dify-WebUI is a modern desktop smart conversation app based on the Dify API, designed to provide enterprises with powerful AI conversation capabilities. The application supports a variety of preset theme colors to meet the personalized needs of enterprises, and has a knowledge base management function to support...
7mos ago
03.5K
小红书AI运营助手:自动生成和发布小红书文章

Xiaohongshu AI operation assistant: automatically generate and publish Xiaohongshu articles

Comprehensive Introduction Xiaohongshu AI Operation Assistant (xhsaipublisher) is an automation tool designed for publishing articles on the Xiaohongshu platform. The program combines a graphical user interface with automation scripts that utilize big model technology to generate content and automatically log in and publish via browser...
7mos ago
03.8K
Orchestra: Building Smart AI Teams for Easier and More Efficient Multi-Intelligence Collaborative Development

Orchestra: Building Smart AI Teams for Easier and More Efficient Multi-Intelligence Collaborative Development

Comprehensive Introduction Orchestra is an innovative lightweight Python framework that focuses on building multi-intelligence collaborative systems based on the Large Language Model (LLM). It employs a unique method of arranging intelligences so that multiple AI intelligences can work together harmoniously like a symphony orchestra. By modeling ...
7mos ago
02.1K
Harbor:一键部署本地LLM开发环境,轻松管理和运行AI服务的容器化工具集

Harbor: a containerized toolset for easily managing and running AI services with one-click deployment of local LLM development environments

Comprehensive Introduction Harbor is a revolutionary containerized LLM toolset focused on simplifying the deployment and management of local AI development environments. It enables developers with a clean command line interface (CLI) and companion application to launch and manage with a single click, including LLM backends, API interfaces, front...
7mos ago
02.7K
ExtractThinker:提取和分类文档为结构化数据,优化文档处理流程

ExtractThinker: extracting and classifying documents into structured data to optimize the document processing flow

Comprehensive Introduction ExtractThinker is a flexible document intelligence tool that utilizes Large Language Models (LLMs) to extract and classify structured data from documents, providing a seamless ORM-like document processing workflow. It supports a variety of document loaders, including Tess...
7mos ago
02.7K
NeoAI:让AI接管电脑远程操作,使用自然语言控制电脑的开源项目

NeoAI: Open source project that lets AI take over remote operation of computers and control them using natural language

General Introduction NeoAI is an innovative open source AI assistant tool that allows users to easily control and manage their computers through natural language conversations. Without writing any code, users can simply use everyday conversations to find files, automate tasks, manage devices, etc.NeoAI...
7mos ago
04.1K
TryOffAnyone:从人物身上提取服装为平铺服装展示图的AI工具

TryOffAnyone: AI tool for extracting garments from a person as a tiled garment display image

Comprehensive Introduction TryOffAnyone is a breakthrough AI image processing tool specialized in solving the challenges of clothing display in the e-commerce field. It is able to intelligently convert photos of clothes in real people's wearing state into lay-flat display effect images, this technology is based on the latest Latent Dif...
7mos ago
02.7K
ScrapeGraphAI:一个提示词搞定网页抓取,无需编写规则智能网页内容提取工具

ScrapeGraphAI: A single cue word for web crawling, no need to write rules intelligent web content extraction tools

Comprehensive Introduction ScrapeGraphAI is an innovative Python web crawling library that cleverly combines Large Language Modeling (LLM) and Direct Graph Logic to create crawling pipelines for websites and local documents. The uniqueness of this tool lies in its perfect level of simplicity and power...
7mos ago
02.2K
AnkiAIUtils: Anki Flashcard Learning AI Toolset, an intelligent assistant that automatically optimizes memorized cards

AnkiAIUtils: Anki Flashcard Learning AI Toolset, an intelligent assistant that automatically optimizes memorized cards

General Description AnkiAIUtils is a set of AI-enhanced tools designed for the Anki flashcard learning system. Developed by a medical student, the tool is designed to automatically improve cards that users are struggling with during the learning process through AI technology. It can intelligently provide users with personalized...
7mos ago
02.9K
Story-Adapter:根据长篇故事生成连续且风格一致的图像插画

Story-Adapter: generating continuous and consistent graphic illustrations based on a long story

General Introduction Story-Adapter is an innovative story visualization framework that converts textual stories into coherent image sequences. Developed by researchers, this project employs an iterative approach that requires no training to generate high-quality story illustrations. The framework is characterized by its ability to handle long...
7mos ago
02.8K
ElizaOS:构建自主执行的多智能体,功能完备的开源AI智能体开发框架

ElizaOS: Building Autonomously Executing Multi-Intelligents, a Fully Functional Open Source AI Intelligent Body Development Framework

Comprehensive introduction Eliza is an advanced multi-intelligent body (Multi-Agent) development framework , is committed to simplifying the construction and deployment of autonomous intelligent body (Autonomous Agent) process . It supports the deployment of multiple intelligent bodies with different role settings , can realize intelligent ...
7mos ago
04.1K
Memary:利用知识图谱增强Agent长期记忆的开源项目

Memary: an open-source project to enhance Agent long-term memory using knowledge graphs

General Introduction Memary is an innovative open source project focused on providing long-term memory management solutions for autonomous intelligences. The project helps intelligences break through the limitations of traditional context windows to achieve smarter interaction experiences through knowledge graphs and specialized memory modules.Memary adopts...
7mos ago
04.4K
AI reads books:AI逐页阅读PDF书籍,自动提取知识要点并生成总结

AI reads books: AI reads PDF books page by page, automatically extracts the main points of knowledge and generates summaries.

Comprehensive Introduction AI-reads-books-page-by-page is a Python-based development of intelligent PDF book analysis tool, which can automate the page-by-page analysis of PDF books, extract the key knowledge points, and after the specified page interval to generate stage...
7mos ago
03.5K
AnyText:生成和编辑多语言图像文本,高可控在图像中生成多行中文

AnyText: Generate and edit multi-language image text, highly controllable to generate multiple lines of Chinese in the image

Comprehensive Introduction AnyText is a revolutionary multilingual visual text generation and editing tool developed based on the diffusion model. It generates natural, high-quality multilingual text in images and supports flexible text editing features. It was developed by a team of researchers and presented at ICLR 2024...
7mos ago
03.1K
AI2SRT:利用 Gemini模型,一键为长视频创建解说短视频或视频总结

AI2SRT: Create short narrated videos or video summaries for long videos with one click using Gemini models

Comprehensive Introduction AI2SRT is an open source project that utilizes the GeminiAI Big Model to generate short narrated videos and video summaries for long videos with one click, while supporting audio and video transcription subtitles. The project aims to simplify the video content creation process and provide efficient subtitle generation and translation functions. Users can pass...
8mos ago
03.1K
CogAgent:智谱开源的智能视觉语言模型,实现图形界面自动化操作

CogAgent: Smart Spectrum's open source intelligent visual language model for automating graphical interfaces

Comprehensive Introduction CogAgent is an open source visual language model developed by Tsinghua University Data Mining Research Group (THUDM), aiming to automate the operation of cross-platform graphical user interface (GUI). The model is based on CogVLM (GLM-4V-9B) and supports bilingual Chinese and English...
8mos ago
02.9K
DisPose:生成人体姿态精准控制的视频,创作跳舞的小姐姐

DisPose: generating videos with precise control of human posture, creating dancing ladies

General Introduction DisPose is an innovative open source artificial intelligence project focused on controlled character image animation generation. Developed by a team of researchers and open-sourced on GitHub, the project uses advanced deep learning techniques to achieve precise character animation control by decomposing skeletal pose information.D...
8mos ago
02.5K
Smolagents: open source project for rapid development of AI intelligences and lightweight construction of intelligences

Smolagents: open source project for rapid development of AI intelligences and lightweight construction of intelligences

Comprehensive Introduction Smolagents is a lightweight intelligent agent library developed by HuggingFace that focuses on simplifying the development process of AI agent systems. The project is known for its clean design philosophy, with only about 1000 lines of core code, yet provides powerful feature integration capabilities. It is most ...
7mos ago
04.2K
InvSR:开源图像超分辨率项目,提升图像分辨率质量

InvSR: Open source image super-resolution project to improve the quality of image resolution

General Introduction InvSR is an innovative open-source image super-resolution project based on diffusion inversion techniques capable of converting low-resolution images into high-quality, high-resolution images. The project utilizes the rich a priori knowledge of images embedded in pre-trained large-scale diffusion models to support, through a flexible sampling mechanism, the...
8mos ago
03.5K
Infinity:生成高分辨率图像的比特自回归建模,实现无限制高分辨率图像生成

Infinity: bitwise autoregressive modeling for generating high-resolution images for unlimited high-resolution image generation

General Introduction Infinity is a groundbreaking high-resolution image generation framework developed by the FoundationVision team. The project breaks through the limitations of traditional image generation models through an innovative bit-level visual autoregressive modeling approach.The core features of Infinity...
8mos ago
03.5K
GPTme:在命令行终端中运行的智能编程助手,ChatGPT代码解释器的本地化替代方案

GPTme: Intelligent Programming Assistant Running in a Command Line Terminal, Localized Alternative to ChatGPT Code Interpreter

Comprehensive Introduction GPTMe is a revolutionary terminal AI assistant tool designed to enhance developers' work efficiency. It perfectly combines powerful AI capabilities with the terminal environment, supporting diverse functions such as code execution, file editing, web browsing and visual recognition. As ChatGPT code solving...
8mos ago
02.9K
VideoSeal:先进的开源视频隐藏水印嵌入与提取工具,保护视频版权

VideoSeal: Advanced open source video hidden watermark embedding and extraction tools to protect video copyrights

General Introduction VideoSeal is an open source video watermarking tool developed by Facebook Research, designed to provide efficient video watermark embedding and extraction. The tool supports the latest open source models and contains pre-trained models, training code, inference code and evaluation tools...
8mos ago
02.8K