AI open source project

Total 1020 articles posts
小智 AI 聊天机器人:打造你的AI聊天伴侣,轻松实现语音对话和智能互动

Xiaozhi AI Chatbot: Build your AI chatting companion, easily realize voice conversation and intelligent interaction

Comprehensive Introduction Xiaozhi AI Chatbot is an open source project based on the ESP32 development board, designed to help users build their own AI chat companion. The project was developed by Shrimp and is mainly used for teaching purposes to help more people get started with AI hardware development and to understand how to apply large language models to real...
10mos ago
0163.3K
CosyVoice:阿里推出的3秒急速语音克隆开源项目,支持情感控制标签

CosyVoice: 3-second rush voice cloning open source project launched by Ali with support for emotionally controlled tags

Comprehensive Introduction CosyVoice is a multilingual large-scale speech generation model that provides full-stack capabilities from inference, training to deployment. Developed by the FunAudioLLM team, it aims to achieve high quality speech through advanced autoregressive transformers and ODE-based diffusion models...
11mos ago
0128K
VisoMaster:强大且易用的图片/视频换脸和编辑软件

VisoMaster: Powerful and easy-to-use photo/video face changing and editing software

General Introduction VisoMaster is a powerful and easy-to-use video face-swapping and editing tool that utilizes artificial intelligence technology to achieve natural and realistic face-swapping effects. Whether it's an image or a video, VisoMaster can generate high-quality face swap results with simple operations, suitable for general...
11mos ago
0122.5K
FunASR:开源语音识别工具包,说话人分离/ 多人对话语音识别

FunASR: Open Source Speech Recognition Toolkit, Speaker Separation / Multi-Person Conversation Speech Recognition

Comprehensive Introduction FunASR is an open source speech recognition toolkit developed by Alibaba's Dharma Institute to bridge academic research and industrial applications. It supports a wide range of speech recognition features, including speech recognition (ASR), voice endpoint detection (VAD), punctuation recovery, language modeling, speaking...
1yrs ago
0110.8K
EXO:利用闲置家用设备运行分布式AI集群,支持多种推理引擎和自动设备发现。

EXO: Running distributed AI clusters using idle home devices with support for multiple inference engines and automated device discovery.

General Introduction Exo is an open source project designed to run its own AI cluster using everyday devices (e.g. iPhone, iPad, Android, Mac, Linux, etc.). Through dynamic model partitioning and automated device discovery, Exo is able to unify multiple devices into one powerful...
1yrs ago
0102.2K
MinerU:PDF文档提取转换为多模态Markdown格式,支持电子书OCR扫描

MinerU: PDF document extraction and conversion to multimodal Markdown format, support e-book OCR scanning

Comprehensive Introduction MinerU is an open source data extraction tool developed by the OpenDataLab team at the Shanghai Artificial Intelligence Laboratory, focusing on efficiently extracting content from complex PDF documents, web pages, and eBooks. It can take multimodal PDFs containing images, formulas, tables and other elements...
1yrs ago
0100.4K
PDFMathTranslate:保留PDF完整排版的AI翻译工具

PDFMathTranslate: AI translation tool that preserves the full typography of PDFs

Comprehensive introduction PDFMathTranslate is an open source tool focusing on the translation of scientific papers , PDF documents can be translated in full and generate a bilingual version . It uses AI technology to retain the full layout of the original document , including formulas , diagrams , tables of contents and notes , support ...
7mos ago
099.2K
Meetily:生成会议纪要的AI助手,实时转录和生成会议摘要

Meetily: an AI assistant for generating meeting minutes, transcribing and generating meeting summaries in real-time

General Description Meetily is an AI-powered meeting assistant developed by Zackriya Solutions that captures meeting audio in real-time, performs voice transcription, and generates meeting summaries. It is unique in that all processing is done locally on the device, ensuring user privacy...
11mos ago
093K
LiveTalking:开源实时互动数字人直播系统,实现音视频同步对话

LiveTalking: open source real-time interactive digital human live system, to achieve synchronous audio and video dialogues

Comprehensive introduction LiveTalking is an open source real-time interactive digital human system , is committed to building high-quality digital human live solution . The project uses the Apache 2.0 open source protocol and integrates a number of cutting-edge technologies , including ER-NeRF rendering , real-time audio and video streaming processing ...
1yrs ago
092.5K
IOPaint:全能AI图像处理工具,擦除、扩图、替换元素与绘制文本

IOPaint: All-around AI image processing tool, erasing, expanding, replacing elements and drawing text.

General Introduction IOPaint is a free and open source AI image processing tool that supports image erasing, repairing and expanding. It uses state-of-the-art AI models to help users easily remove unwanted objects from an image, repair blemishes, add new content, and even expand an image.IOPa...
1yrs ago
081K
Chatbot UI:模仿ChatGPT界面和功能的开源AI聊天应用程序

Chatbot UI: an open source AI chat app that mimics ChatGPT's interface and functionality

General Introduction Chatbot UI is an open source project designed to help developers create personalized and intelligent conversational interfaces. The project provides a series of interface components and interactive features that can be easily integrated into the existing Chatbot system to provide users with a more fluent and intelligent dialog body...
1yrs ago
080.2K
FunClip:智能剪辑视频内容为短片,轻松实现精准视频片段提取/裁剪

FunClip: Intelligent editing of video content into short clips, easy to realize accurate video clip extraction/cropping

Comprehensive Introduction FunClip is a fully open source localized automatic video editing tool developed by TONGYI Speech Lab of Alibaba Dharma Institute. The tool integrates the industrial-grade Paraformer-Large speech recognition model, which can accurately recognize the speech in the video...
1yrs ago
078.5K
FramePack:6G低显存快速生成长视频的开源项目

FramePack: 6G low graphics memory fast raw long video open source project

General Introduction FramePack is an open source video generation tool focused on making video diffusion techniques more practical. It decouples the generation workload from the video length by compressing the input frames to a fixed length through a unique next frame prediction neural network. This means that even when generating long videos, the video memory requirements...
8mos ago
078.2K
Coqui TTS(xTTS):文本到语音生成的深度学习工具包,支持多种语言和声音克隆功能

Coqui TTS (xTTS): Deep Learning Toolkit for Text-to-Speech Generation with Multiple Language Support and Voice Cloning Capabilities

Comprehensive Introduction Coqui TTS is an open source advanced text-to-speech (TTS) generation toolkit based on deep learning techniques. It has been battle-tested in both research and production environments, and provides a rich set of features and models that support text-to-speech conversion in multiple languages.Coqui TTS...
11mos ago
077.5K
XHS-Downloader:免费小红书数据采集工具,支持笔记批量下载、视频提取、图片去水印

XHS-Downloader: Free Xiaohongshu data collection tool, support notes batch download, video extraction, image watermarking

General Introduction XHS-Downloader is an open source tool designed for Xiaohongshu users to support extracting and downloading watermark-free images and video works on Xiaohongshu. The tool provides a variety of features, including getting cookies from browsers, support for command line operations, batch download...
1yrs ago
076.1K
R2R:多模态内容解析并结合知识图谱与混合搜索的先进AI检索(RAG)系统

R2R: An Advanced AI Retrieval (RAG) System for Multimodal Content Parsing and Combining Knowledge Graph with Hybrid Search

Comprehensive Introduction R2R (RAG to Riches) is an advanced AI retrieval system supporting Retrieval Augmented Generation (RAG) functionality with production-ready features. Built on a containerized RESTful API, the system provides multimodal content parsing, hybrid search functionality...
1yrs ago
070.9K
微信视频号下载器:快速下载微信视频号视频,支持多种格式和平台

WeChat Video No. Downloader: quickly download WeChat Video No. video, support multiple formats and platforms

Comprehensive Introduction WeChat Video No. Downloader is an open source project designed to help users quickly download video content from WeChat video numbers. The tool supports a variety of video formats and platforms, and users can easily use it on Windows and macOS systems. The project is developed by ltaoo and hosted on...
1yrs ago
070.8K
OpenBB:开源金融数据分析平台,集成私有数据集和 AI 来增强投资决策

OpenBB: Open Source Financial Data Analytics Platform Integrates Private Datasets and AI to Enhance Investment Decisions

General Introduction OpenBB is a free and fully open source financial data analytics platform designed to provide easy access to financial data and analytics tools for all. The platform integrates over 100 different data sources covering stocks, options, cryptocurrencies, forex, macroeconomic indicators, fixed...
12mos ago
069.2K
Danswer: 专注企业知识管理与文档搜索的AI助手,集成多种工作工具

Danswer: AI assistant specializing in enterprise knowledge management and document search, integrating multiple work tools

General Introduction Danswer is an open source enterprise document retrieval AI assistant designed to connect to team documents, applications and people to provide unified search and natural language query answers through an intelligent chat interface and unified search capabilities. Ensuring that user data and chats are fully controlled...
10mos ago
068.6K
Dify:生成式AI应用开发平台,可视化编排, 支持私有化部署

Dify: generative AI application development platform, visual orchestration, private deployment support

Comprehensive Introduction Dify is an open source generative AI application development platform designed to help developers rapidly build and operate native AI applications based on Large Language Models (LLMs). The platform provides everything from Agent building to AI workflow orchestration, RAG retrieval...
12mos ago
068.1K
VITA:开源视觉与语音实时交互的多模态大语言模型

VITA: Open Source Multimodal Large Language Model for Real-Time Interaction between Vision and Speech

General Introduction VITA is a leading open source interactive multimodal large language modeling project, pioneering the ability to achieve true full multimodal interaction. The project launched VITA-1.0 in August 2024, pioneering the first open source interactive fully-modal large language model.2024...
1yrs ago
068K