AI Open Source Projects

Total: 1,020 articles
Xiaozhi AI Chatbot: Build your own AI chat companion with easy voice conversation and intelligent interaction

Comprehensive Introduction: Xiaozhi AI Chatbot is an open source project based on the ESP32 development board, designed to help users build their own AI chat companion. The project was developed by Shrimp and is mainly used for teaching purposes, helping more people get started with AI hardware development and understand how to apply large language models to real...
8mos ago
0109.1K
CosyVoice: Open source project from Alibaba for rapid 3-second voice cloning, with support for emotion control tags

Comprehensive Introduction: CosyVoice is a multilingual large-scale speech generation model that provides full-stack capabilities from inference and training to deployment. Developed by the FunAudioLLM team, it aims to achieve high-quality speech through advanced autoregressive transformers and ODE-based diffusion models...
10mos ago
090.9K
VisoMaster: Powerful and easy-to-use image/video face-swapping and editing software

General Introduction: VisoMaster is a powerful and easy-to-use face-swapping and editing tool that uses artificial intelligence to achieve natural, realistic face-swapping effects. Whether the input is an image or a video, VisoMaster can generate high-quality face swap results with simple operations, suitable for general...
9mos ago
085.6K
FunASR: Open source speech recognition toolkit with speaker diarization and multi-speaker conversation recognition

Comprehensive Introduction: FunASR is an open source speech recognition toolkit developed by Alibaba DAMO Academy to bridge academic research and industrial applications. It supports a wide range of features, including automatic speech recognition (ASR), voice activity detection (VAD), punctuation restoration, language modeling, speaking...
1yr ago
082.5K
MinerU: Extract PDF documents and convert them to multimodal Markdown, with OCR support for scanned e-books

Comprehensive Introduction: MinerU is an open source data extraction tool developed by the OpenDataLab team at the Shanghai Artificial Intelligence Laboratory, focusing on efficiently extracting content from complex PDF documents, web pages, and e-books. It can take multimodal PDFs containing images, formulas, tables and other elements...
1yr ago
076.1K
EXO: Run distributed AI clusters on idle home devices, with support for multiple inference engines and automatic device discovery

General Introduction: Exo is an open source project designed to let you run your own AI cluster using everyday devices (e.g. iPhone, iPad, Android, Mac, Linux). Through dynamic model partitioning and automatic device discovery, Exo is able to unify multiple devices into one powerful...
1yr ago
072.4K
Meetily: AI assistant that transcribes meetings and generates minutes and summaries in real time

General Description: Meetily is an AI-powered meeting assistant developed by Zackriya Solutions that captures meeting audio in real time, transcribes speech, and generates meeting summaries. It is unique in that all processing is done locally on the device, ensuring user privacy...
10mos ago
070.1K
PDFMathTranslate: AI translation tool that preserves the full layout of PDFs

Comprehensive Introduction: PDFMathTranslate is an open source tool focused on translating scientific papers. It translates PDF documents in full and generates a bilingual version. It uses AI technology to retain the full layout of the original document, including formulas, diagrams, tables of contents, and notes, support ...
6mos ago
067.7K
IOPaint: All-in-one AI image processing tool for erasing, outpainting, replacing elements, and drawing text

General Introduction: IOPaint is a free and open source AI image processing tool that supports image erasing, repairing, and expanding. It uses state-of-the-art AI models to help users easily remove unwanted objects from an image, repair blemishes, add new content, and even expand an image. IOPa...
1yr ago
065.1K
LiveTalking: Open source real-time interactive digital human livestreaming system with synchronized audio and video dialogue

Comprehensive Introduction: LiveTalking is an open source real-time interactive digital human system committed to building a high-quality digital human livestreaming solution. The project uses the Apache 2.0 open source license and integrates a number of cutting-edge technologies, including ER-NeRF rendering, real-time audio and video stream processing ...
11mos ago
064K
FunClip: Intelligently cut video content into short clips, with accurate clip extraction and trimming

Comprehensive Introduction: FunClip is a fully open source, locally deployed automatic video editing tool developed by the TONGYI Speech Lab of Alibaba DAMO Academy. The tool integrates the industrial-grade Paraformer-Large speech recognition model, which can accurately recognize the speech in the video...
11mos ago
058.5K
Danswer: AI assistant for enterprise knowledge management and document search, integrated with multiple workplace tools

General Introduction: Danswer is an open source AI assistant for enterprise document retrieval, designed to connect to team documents, applications, and people, providing unified search and natural language answers through an intelligent chat interface. Ensuring that user data and chats are fully controlled...
9mos ago
057.8K
FramePack: Open source project for fast long-video generation with as little as 6 GB of VRAM

General Introduction: FramePack is an open source video generation tool focused on making video diffusion techniques more practical. It decouples the generation workload from the video length by compressing the input frames to a fixed length through a unique next-frame prediction neural network. This means that even when generating long videos, the video memory requirements...
7mos ago
055.9K
OpenBB: Open source financial data analytics platform that integrates private datasets and AI to enhance investment decisions

General Introduction: OpenBB is a free and fully open source financial data analytics platform designed to give everyone easy access to financial data and analytics tools. The platform integrates over 100 different data sources covering stocks, options, cryptocurrencies, forex, macroeconomic indicators, fixed...
10mos ago
052.7K
LibreChat: Open source AI chat project that mimics the ChatGPT interface

General Introduction: LibreChat is a free, open source AI chat platform with extensive customization options and support for multiple AI providers, services, and integrations. It brings together all AI conversations in one place with a familiar interface and innovative features, supporting multiple AI models, plugins, and multiple languages. By...
1yr ago
052K
Coqui TTS (xTTS): Deep learning toolkit for text-to-speech generation with multilingual support and voice cloning

Comprehensive Introduction: Coqui TTS is an advanced open source text-to-speech (TTS) generation toolkit based on deep learning techniques. It has been battle-tested in both research and production environments, and provides a rich set of features and models that support text-to-speech conversion in multiple languages. Coqui TTS...
10mos ago
051.2K
Chatbot UI: Open source AI chat app that mimics ChatGPT's interface and functionality

General Introduction: Chatbot UI is an open source project designed to help developers create personalized and intelligent conversational interfaces. The project provides a set of interface components and interactive features that can be easily integrated into an existing chatbot system to provide users with a more fluent and intelligent dialog body...
1yr ago
051.2K
XHS-Downloader: Free Xiaohongshu data collection tool with batch note download, video extraction, and image watermark removal

General Introduction: XHS-Downloader is an open source tool designed for Xiaohongshu users that supports extracting and downloading watermark-free images and videos from Xiaohongshu. The tool provides a variety of features, including getting cookies from browsers, support for command line operations, batch download...
1yr ago
050.9K
Sim Studio: Open source workflow builder for AI agents

Comprehensive Introduction: Sim Studio is an open source AI agent workflow building platform focused on helping users quickly design, test, and deploy large language model (LLM) workflows through a lightweight, intuitive visual interface. Users can create complex workflows without deep programming knowledge by dragging and dropping...
6mos ago
050.8K
Vexa: Real-time meeting transcription and intelligent knowledge extraction tool

Comprehensive Introduction: Vexa is an open source real-time meeting transcription and knowledge management platform designed to provide efficient meeting recording and intelligent knowledge extraction for enterprises and individuals. It automatically joins platforms such as Google Meet and Zoom through API-driven meeting bots...
7mos ago
049.8K
Ragas: Evaluate RAG retrieval, QA accuracy, and answer relevance

Comprehensive Introduction: Ragas is a tool specifically designed to evaluate and optimize Retrieval Augmented Generation (RAG) systems. It provides a comprehensive set of evaluation metrics by analyzing the relationships between queries, retrieved contexts, and generated answers. These metrics include faithfulness, answer relevance, context relevance, on...
10mos ago
049.5K
VITA: Open source multimodal large language model for real-time vision and speech interaction

General Introduction: VITA is a leading open source interactive multimodal large language model project and the first to achieve truly full multimodal interaction. The project launched VITA-1.0 in August 2024, the first open source interactive fully multimodal large language model. 2024...
11mos ago
049.1K