AI open source project

Total: 1020 articles
VITA: Open Source Multimodal Large Language Model for Real-Time Vision and Speech Interaction

General Introduction: VITA is a leading open source interactive multimodal large language model project and an early effort toward true omni-modal interaction. The project launched VITA-1.0 in August 2024, the first open source interactive omni-modal large language model. 2024...
11mos ago
049.1K
OpenSPG: Open Source Knowledge Graph Engine

Comprehensive Introduction: OpenSPG is an open source knowledge graph engine developed by Ant Group in collaboration with OpenKG, based on the SPG (Semantic-enhanced Programmable Graph) framework. The engine is designed to provide explicit semantic representation, logical rule definition, and an operator framework to support the construction and management of domain knowledge graphs...
1yr ago
048.1K
WrenAI: Conversational Data Analytics AI Assistant with Direct Access to Answers, SQL Queries, and Analytics Reports

General Introduction: WrenAI is an open source SQL AI assistant designed to help data, product, and business teams gain data insights through natural language conversations. It converts natural language into SQL queries and generates charts, spreadsheets, and reports, supporting multilingual...
11mos ago
047.8K
Linly-Dubbing: Intelligent Multilingual AI Video Dubbing and Translation Tool

Comprehensive Introduction: Linly-Dubbing is an intelligent multilingual AI dubbing and translation tool that integrates advanced AI technology to provide high-quality multilingual video dubbing and subtitle translation. The tool is especially suitable for international education, global content localization, and similar scenarios, helping...
10mos ago
047.7K
MaxKB: Out-of-the-Box AI Knowledge Base Q&A System for Smart Customer Service and In-House Knowledge Bases

Comprehensive Introduction: MaxKB (Max Knowledge Base) is an open source knowledge base Q&A system based on large language models and RAG (Retrieval-Augmented Generation). The system is widely used in intelligent customer service, enterprise internal knowledge bases, academic research, education, and other scenarios. MaxKB...
10mos ago
047.5K
RAGFlow: Open Source RAG Engine Based on Deep Document Understanding, Providing Efficient Retrieval-Augmented Generation Workflows

Comprehensive Introduction: RAGFlow is an open source Retrieval-Augmented Generation (RAG) engine based on deep document understanding. It provides an efficient RAG workflow for organizations of all sizes, incorporating large language models (LLMs) capable of delivering data in complex formats based on real...
10mos ago
047.2K
Dify: Generative AI Application Development Platform with Visual Orchestration and Private Deployment Support

Comprehensive Introduction: Dify is an open source generative AI application development platform designed to help developers rapidly build and operate AI-native applications based on Large Language Models (LLMs). The platform provides everything from Agent building to AI workflow orchestration, RAG retrieval...
10mos ago
047K
Smolagents: Open Source Project for Rapid, Lightweight Development of AI Agents

Comprehensive Introduction: Smolagents is a lightweight agent library developed by Hugging Face that focuses on simplifying the development of AI agent systems. The project is known for its clean design philosophy: its core is only about 1000 lines of code, yet it provides powerful feature integration capabilities. It is most ...
11mos ago
046.9K
Browser Use Web UI: Open Source Framework for Running AI Agents That Browse the Web and Operate Web Pages Automatically

Comprehensive Introduction: Browser Use Web UI is an open source project that provides a graphical interface for the browser interaction capabilities of AI agents. The project is built on top of the browser-use core framework, built with Gradio ...
6mos ago
046.4K
WeChat Channels Downloader: Quickly Download WeChat Channels Videos, with Support for Multiple Formats and Platforms

Comprehensive Introduction: WeChat Channels Downloader is an open source project designed to help users quickly download video content from WeChat Channels. The tool supports a variety of video formats and platforms, and runs on both Windows and macOS. The project is developed by ltaoo and hosted on...
11mos ago
046.1K
A2A: Google Releases Open Protocol for Communication Between AI Agents

General Introduction: A2A (Agent2Agent) is an open source protocol developed by Google that allows AI agents built with different frameworks or by different vendors to communicate and collaborate with each other. It provides a standardized set of methods for agents to discover each other's capabilities, share tasks, and complete work...
8mos ago
046K
OmniSVG: Open Source Project for Generating SVG Vector Graphics from Text and Images

General Introduction: OmniSVG is an open source project focused on generating high-quality vector graphics (SVG) with a multimodal model. It uses pre-trained vision-language models to support SVG generation from text descriptions or image input, covering a wide range of scenarios from simple icons to complex anime characters. The project ...
8mos ago
045.9K
bilive: Unattended Live Stream Recording, Automatic Clipping, and Upload Tool for Bilibili

Comprehensive Introduction: bilive is a tool designed for recording Bilibili live streams, providing fast recording, automatic clipping, danmaku (bullet comment) rendering, and subtitle generation. The tool runs on very low-spec machines, supports 24/7 unattended recording, automatically recognizes and renders danmaku and subtitles, automatically clips and...
10mos ago
045.9K
Linly-Talker: Digital Human Intelligent Dialogue System Combining Large Language Models and Visual Models for a New Interactive Experience

Comprehensive Introduction: Linly-Talker is a digital human dialogue system that combines Large Language Models (LLMs) with visual models to create a novel approach to human-computer interaction. The system integrates a variety of technologies such as Whisper, Linly, Micros...
10mos ago
045.6K
Qlib: AI Quantitative Investment Research Tool Developed by Microsoft

Comprehensive Introduction: Qlib is an open source platform developed by Microsoft that uses AI technology to help users research quantitative investment. It starts from basic data processing and helps users explore investment ideas and turn them into usable strategies. The platform is simple and easy to use, and is suitable for those who want to use machine learning to improve their investment research...
8mos ago
045.3K
MetaGPT: Multi-Agent Collaboration Framework for Building AI Software Development Teams and Enabling Natural Language Programming

Comprehensive Introduction: MetaGPT is a multi-agent framework designed to model the operations of a complete AI software company. Created by geekan (Alexander Wu), the project aims to combine GPT models with different roles into a collaborative entity...
9mos ago
044.9K
cognee: Open Source RAG Framework Built on Knowledge Graphs, with a Study of Its Core Prompts

General Introduction: Cognee is a reliable data layer solution designed for AI applications and AI agents. It is designed to load and build LLM (Large Language Model) contexts, creating accurate and interpretable AI solutions through knowledge graphs and vector stores. The framework favors cost-saving, interpretable...
10mos ago
044.5K
TRV: Rapidly Generate Presentation Videos from Slides/PPTs and Speaker Notes

General Introduction: TRV is an open source tool, hosted on GitHub, designed to help users quickly convert slides and presentation notes into narrated videos. It automatically generates audio and video content from input presentation files through simple command line operations, suitable for those who need to quickly create presentations...
9mos ago
044.4K
ElizaOS: Full-Featured Open Source AI Agent Development Framework for Building Autonomous Multi-Agent Systems

Comprehensive Introduction: Eliza is an advanced multi-agent development framework dedicated to simplifying the construction and deployment of autonomous agents. It supports deploying multiple agents with different role settings and can realize intelligent ...
11mos ago
044.2K
RMBG-2-Studio: Open Source Program for Batch Removal of Image and Video Backgrounds, Built on RMBG 2.0

General Introduction: RMBG-2-Studio is an enhanced background removal and replacement application built on the BRIA-RMBG-2.0 model. The application is designed to provide users with efficient and accurate image background processing for a variety of image types, including e-commerce, gaming and...
12mos ago
043.9K
Orion: Xiaomi's Open Source End-to-End Autonomous Driving Reasoning and Planning Framework

Comprehensive Introduction: Orion is an open source project developed by Xiaomi Labs, focusing on end-to-end (E2E) autonomous driving. It addresses the insufficient causal reasoning of traditional autonomous driving approaches in complex scenarios by using vision-language models (VLM) and generative planners. Orion integrates long...
8mos ago
043.9K
SP-MangaEditer: Professional Four-Panel Manga Illustration Tool for Generating Images and Editing Manga Pages

General Introduction: SP-MangaEditer is an independent manga editing platform designed for manga creators. The platform supports image generation, layer editing, image adjustment, filters, and many other functions to help users easily create high-quality manga illustrations. Users can simply manipulate...
11mos ago
043.9K
CrewAI: Multi-Agent Role-Playing Collaboration Framework for Simplifying Complex Tasks

Comprehensive Introduction: CrewAI is an advanced framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI enables agents to work together seamlessly to solve complex tasks. Whether you're building an intelligent assistant platform, automating customer service teams, or multi-agent...
12mos ago
043.6K
MatAnyone: Open Source Tool for Extracting a Specified Target Portrait from Video and Generating a Portrait Video

General Introduction: MatAnyone is an open source project focusing on video matting, developed and released on GitHub by a research team at S-Lab, Nanyang Technological University, Singapore. It provides users with stable and efficient video processing through consistent memory propagation, especially...
9mos ago
043.5K
Deep Live Cam: Open Source Real-Time AI Face-Swapping Tool for Live Face Swaps from a Single Photo

General Introduction: Deep Live Cam is an open source artificial intelligence tool designed to enable real-time face replacement and deepfake video generation from a single photo. The tool uses deep learning algorithms to replace faces in real time during live streams or video calls, protecting user privacy and adding fun...
1yr ago
043.3K
Dify-WebUI: Desktop Intelligent Conversation Client Based on the Dify API, Providing Enterprise-Grade AI Conversation Capabilities

Comprehensive Introduction: Dify-WebUI is a modern desktop conversation app based on the Dify API, designed to provide enterprises with powerful AI conversation capabilities. The application supports a variety of preset theme colors to meet enterprises' customization needs, and has a knowledge base management function to support...
11mos ago
042.7K
Xiaohongshu AI Operation Assistant: Automatically Generate and Publish Xiaohongshu Articles

Comprehensive Introduction: Xiaohongshu AI Operation Assistant (xhsaipublisher) is an automation tool designed for publishing articles on the Xiaohongshu platform. The program combines a graphical user interface with automation scripts that use large model technology to generate content and automatically log in and publish via browser...
11mos ago
042.7K
TRELLIS: Microsoft-Developed 3D Asset Generation Model with Multiple Format Support and Flexible Editing

General Introduction: TRELLIS is a large-scale 3D asset generation model developed by Microsoft. It accepts text or image prompts and generates high-quality 3D assets in a variety of formats, such as radiance fields, 3D Gaussians, and meshes. At the heart of TRELLIS is a unified structured latent...
12mos ago
042.6K