AI Open Source Projects

1,020 articles in total
CosyVoice: Alibaba's open source project for rapid 3-second voice cloning, with support for emotion control tags

Comprehensive Introduction: CosyVoice is a multilingual large-scale speech generation model that provides full-stack capabilities from inference and training to deployment. Developed by the FunAudioLLM team, it aims to achieve high-quality speech through advanced autoregressive transformers and ODE-based diffusion models...
8mos ago
062K
Xiaozhi AI Chatbot: Build your own AI chat companion with easy voice conversation and intelligent interaction

Comprehensive Introduction: Xiaozhi AI Chatbot is an open source project based on the ESP32 development board, designed to help users build their own AI chat companion. The project was developed by Xia Ge ("Brother Shrimp") and is mainly used for teaching purposes, to help more people get started with AI hardware development and to understand how to apply large language models to real...
7mos ago
053.6K
IOPaint: All-in-one AI image processing tool for erasing objects, expanding images, replacing elements, and drawing text

General Introduction: IOPaint is a free and open source AI image processing tool that supports erasing, repairing, and expanding images. It uses state-of-the-art AI models to help users easily remove unwanted objects from an image, repair blemishes, add new content, and even expand an image. IOPa...
12mos ago
049.9K
VisoMaster: Powerful and easy-to-use image/video face-swapping and editing software

General Introduction: VisoMaster is a powerful and easy-to-use face-swapping and editing tool that uses artificial intelligence to achieve natural, realistic face-swapping effects. Whether the input is an image or a video, VisoMaster can generate high-quality face swap results with simple operations, suitable for general...
8mos ago
049.5K
EXO: Run a distributed AI cluster on idle home devices, with support for multiple inference engines and automatic device discovery

General Introduction: Exo is an open source project designed to let you run your own AI cluster on everyday devices (e.g. iPhone, iPad, Android, Mac, Linux). Through dynamic model partitioning and automatic device discovery, Exo is able to unify multiple devices into one powerful...
11mos ago
047.1K
MinerU: Extract PDF documents and convert them to multimodal Markdown, with OCR support for scanned e-books

Comprehensive Introduction: MinerU is an open source data extraction tool developed by the OpenDataLab team at the Shanghai Artificial Intelligence Laboratory, focusing on efficiently extracting content from complex PDF documents, web pages, and e-books. It can take multimodal PDFs containing images, formulas, tables and other elements...
1yr ago
045.4K
FunASR: Open source speech recognition toolkit with speaker diarization and multi-speaker conversation recognition

Comprehensive Introduction: FunASR is an open source speech recognition toolkit developed by Alibaba's DAMO Academy to bridge academic research and industrial applications. It supports a wide range of speech features, including automatic speech recognition (ASR), voice activity detection (VAD), punctuation restoration, language modeling, speaking...
12mos ago
042.3K
Meetily: AI assistant that transcribes meetings in real time and generates meeting minutes and summaries

General Description: Meetily is an AI-powered meeting assistant developed by Zackriya Solutions that captures meeting audio in real time, transcribes it, and generates meeting summaries. It is unique in that all processing is done locally on the device, ensuring user privacy...
8mos ago
039.7K
Danswer: AI assistant for enterprise knowledge management and document search, integrating a range of workplace tools

General Introduction: Danswer is an open source AI assistant for enterprise document retrieval, designed to connect to a team's documents, applications, and people and to provide unified search and natural-language answers through an intelligent chat interface. Ensuring that user data and chats are fully controlled...
7mos ago
039.7K
LiveTalking: Open source real-time interactive digital human livestreaming system with synchronized audio and video dialogue

Comprehensive Introduction: LiveTalking is an open source real-time interactive digital human system, committed to building a high-quality digital human livestreaming solution. The project is released under the Apache 2.0 open source license and integrates a number of cutting-edge technologies, including ER-NeRF rendering, real-time audio and video stream processing...
9mos ago
037.6K
FunClip: Intelligently edit videos into short clips, with precise clip extraction and trimming

Comprehensive Introduction: FunClip is a fully open source, locally run automatic video editing tool developed by the Tongyi Speech Lab of Alibaba DAMO Academy. The tool integrates the industrial-grade Paraformer-Large speech recognition model, which can accurately recognize the speech in the video...
9mos ago
036.5K
OpenBB: Open source financial data analytics platform that integrates private datasets and AI to enhance investment decisions

General Introduction: OpenBB is a free and fully open source financial data analytics platform designed to give everyone easy access to financial data and analytics tools. The platform integrates over 100 different data sources covering stocks, options, cryptocurrencies, forex, macroeconomic indicators, fixed...
9mos ago
035.9K
FramePack: Open source project for fast long-video generation with as little as 6 GB of VRAM

General Introduction: FramePack is an open source video generation tool focused on making video diffusion techniques more practical. It decouples the generation workload from the video length by compressing the input frames to a fixed length through a next-frame prediction neural network. This means that even when generating long videos, the video memory requirements...
5mos ago
035.8K
PDFMathTranslate: AI translation tool that preserves the full layout of PDFs

Comprehensive Introduction: PDFMathTranslate is an open source tool focused on translating scientific papers. It can translate PDF documents in full and generate a bilingual version. It uses AI technology to retain the full layout of the original document, including formulas, diagrams, tables of contents and notes, support...
4mos ago
035.3K
Chatbot UI: Open source AI chat app that mimics ChatGPT's interface and functionality

General Introduction: Chatbot UI is an open source project designed to help developers create personalized and intelligent conversational interfaces. The project provides a set of interface components and interactive features that can be easily integrated into existing chatbot systems to provide users with a more fluent and intelligent dialog body...
1yr ago
035.2K
A2A: Google releases an open protocol for communication between AI agents

General Introduction: A2A (Agent2Agent) is an open source protocol developed by Google that allows AI agents built with different frameworks or by different vendors to communicate and collaborate with each other. It provides a standardized set of methods for agents to discover each other's capabilities, share tasks, and complete work...
6mos ago
035.1K
Ragas: Evaluating RAG retrieval, QA accuracy, and answer relevance

Comprehensive Introduction: Ragas is a tool specifically designed to evaluate and optimize Retrieval Augmented Generation (RAG) systems. It provides a comprehensive set of evaluation metrics by analyzing the relationships between queries, retrieved contexts, and generated answers. These metrics include faithfulness, answer relevance, context relevance, on...
9mos ago
034.4K
LibreChat: Open source AI chat project that mimics the ChatGPT interface and interaction

General Introduction: LibreChat is a free, open source AI chat platform with extensive customization options and support for multiple AI providers, services and integrations. It brings together all AI conversations in one place with a familiar interface and innovative features, supporting multiple AI models, plugins and multiple languages. By...
1yr ago
033.6K
VITA: Open source multimodal large language model for real-time vision and speech interaction

General Introduction: VITA is a leading open source interactive multimodal large language model project that aims to achieve true full multimodal interaction. The project launched VITA-1.0 in August 2024 as the first open source interactive fully multimodal large language model. 2024...
9mos ago
032.9K
MaxKB: Out-of-the-box AI knowledge base Q&A system for intelligent customer service and internal enterprise knowledge bases

Comprehensive Introduction: MaxKB (Max Knowledge Base) is an open source knowledge base Q&A system based on large language models and RAG (Retrieval Augmented Generation). The system is widely used in intelligent customer service, internal enterprise knowledge bases, academic research, education and other scenarios. MaxKB...
9mos ago
031.6K
RAGFlow: Open source RAG engine based on deep document understanding, providing efficient retrieval-augmented generation workflows

Comprehensive Introduction: RAGFlow is an open source Retrieval Augmented Generation (RAG) engine based on deep document understanding technology. It provides an efficient RAG workflow for organizations of all sizes, incorporating large language models (LLMs) capable of delivering data in complex formats based on real...
9mos ago
031.2K
Smolagents: Open source project for rapid, lightweight development of AI agents

Comprehensive Introduction: Smolagents is a lightweight agent library developed by Hugging Face that focuses on simplifying the development of AI agent systems. The project is known for its clean design philosophy, with only about 1000 lines of core code, yet provides powerful feature integration capabilities. It is most ...
9mos ago
031.2K
MetaGPT: Multi-agent collaboration framework that builds an AI software development team for natural language programming

Comprehensive Introduction: MetaGPT is an innovative multi-agent framework designed to model the operations of a complete AI software company. Created by geekan (Alexander Wu), the project aims to combine GPT models with different roles into a collaborative entity...
7mos ago
030.8K