AI open source project

Total: 1020 articles
ReCamMaster: Rendering Tool for Generating Multi-View Videos from a Single Video

General Introduction: ReCamMaster is an open source video processing tool whose core function is to generate new camera views from a single video. Users can specify a camera trajectory and re-render the video from different angles. It is developed by a team from Zhejiang University and Kuaishou Technology, based on text-to...
1yrs ago
061.4K
MagicArticulate: generating skeletal structure animation assets from static 3D models

Comprehensive Introduction: MagicArticulate is an AI framework developed by ByteDance in collaboration with Nanyang Technological University that rapidly converts static 3D models into animation-ready digital assets. It does this through an autoregressive Transformer and functional diffusion modeling, self...
1yrs ago
061.3K
Search-o1: giving reasoning models the ability to search actively, so large models can look up external knowledge while they think

Comprehensive Introduction: Search-o1 is an open source project that aims to improve the performance of large reasoning models (LRMs) by integrating search mechanisms. The core idea is to fill the knowledge gaps encountered during reasoning through dynamic search and knowledge integration. The project was developed by sunn...
1yrs ago
061.3K
ExtractThinker: extracting and classifying documents into structured data to optimize the document processing flow

Comprehensive Introduction: ExtractThinker is a flexible document intelligence tool that uses Large Language Models (LLMs) to extract and classify structured data from documents, providing an ORM-like document processing workflow. It supports a variety of document loaders, including Tess...
1yrs ago
061.2K
Orchestra: Building Smart AI Teams for Easier and More Efficient Multi-Agent Collaborative Development

Comprehensive Introduction: Orchestra is a lightweight Python framework for building multi-agent collaborative systems based on large language models (LLMs). It uses its own approach to orchestrating agents so that multiple AI agents can work together like a symphony orchestra. By modeling ...
1yrs ago
060.9K
Magic 1-For-1: an open source project for efficient video generation that claims to generate one minute of video in one minute

Comprehensive Introduction: Magic 1-For-1 is an efficient video generation model designed to optimize memory usage and reduce inference latency. The model decomposes the text-to-video generation task into two subtasks, text-to-image generation and image-to-video generation, enabling more efficient training and distillation...
1yrs ago
060.9K
Harbor: a containerized toolset for easily managing and running AI services with one-click deployment of local LLM development environments

Comprehensive Introduction: Harbor is a containerized LLM toolset that simplifies deploying and managing local AI development environments. Through a clean command line interface (CLI) and a companion application, developers can launch and manage services with a single click, including LLM backends, API interfaces, front...
1yrs ago
060.8K
Moondream: an open source lightweight vision language model for batch reverse-engineering prompts from images

Comprehensive Introduction: Moondream is an open source lightweight vision language model designed to describe images using deep learning and computer vision techniques. The model runs efficiently on a variety of platforms and is particularly suitable for edge devices. Moondream uses advanced techniques and...
1yrs ago
060.7K
Mahilo: an integration platform that connects different AI agent frameworks for real-time collaboration

General Introduction: Mahilo is an open source multi-agent integration platform, released on GitHub by developer Jayesh Sharma, designed to help users connect AI agents from different frameworks to support real-time communication, human-computer interaction, and intelligent collaboration. The ...
1yrs ago
060.6K
PantoMatrix (EMAGE): a framework for generating full-body 3D gesture animation from audio

Comprehensive Introduction: PantoMatrix is a full-body gesture generation framework capable of generating complete human movements from audio and partial gestures, including face, partial body, hand and full-body movements. The framework uses the latest multimodal datasets and deep learning techniques to provide high-quality 3D...
1yrs ago
060.6K
Cooragent: a multi-agent task collaboration tool built from a one-sentence description

General Introduction: Cooragent is an open source AI agent collaboration framework developed by LeapLab at Tsinghua University and hosted on GitHub. It allows users to create AI agents from a one-sentence description and lets multiple agents collaborate on complex tasks. The framework provides two...
11mos ago
060.5K
Deep Research: an AI-based deep research assistant that provides efficient research tools and report generation capabilities

General Introduction: Deep Research is an AI-based research assistant designed to perform iterative deep research by combining search engines, web crawling, and large language models. The project was released by dzhng on GitHub with the goal of providing an easy-to-use deep research genera...
1yrs ago
060.3K
CogView4: an open source text-to-image model that generates high-definition images from Chinese and English prompts

General Introduction: CogView4 is an open source text-to-image model developed by the KEG Lab (THUDM) at Tsinghua University, focusing on converting text descriptions into high-quality images. It supports bilingual prompt input, and is especially good at understanding Chinese prompts and generating images with Chinese characters, non...
1yrs ago
059.9K
PromptWizard: an open source framework for optimizing prompt engineering to improve task performance

Comprehensive Introduction: PromptWizard is an open source framework developed by Microsoft that uses a self-evolving mechanism in which the model generates, evaluates, and improves prompts and examples on its own, raising output quality through continuous feedback. It can autonomously optimize prompts, generate and select appropriate examples, and...
1yrs ago
059.7K
BuffGPT: A Low-Code Development Platform for Enterprise-Grade Generative AI Applications

Comprehensive Introduction: BuffGPT is an open source AI application development platform built on large language models (LLMs), providing out-of-the-box features such as data processing, model invocation, RAG retrieval, and visual workflow orchestration to help users build and operate generative AI applications. The platform supports privatization...
1yrs ago
059.7K
AIHawk: Intelligent Job Search Assistant That Automates Resume Submission (English only)

General Introduction: Auto_Jobs_Applier_AIHawk is a tool that automates the job search using artificial intelligence. It helps users submit a large number of resumes in a short period of time, personalizing each one according to their personal information and job search intentions. The tool is designed to raise...
1yrs ago
059.6K
LLManager: a management tool that combines intelligent automated process approvals with human reviews

Comprehensive Introduction: LLManager is an open source intelligent approval management tool built on LangChain's LangGraph framework. It focuses on automating the processing of approval requests while using human review to improve decisions. It does this through semantic search, few-shot learning and...
12mos ago
059.6K
An open source service that automatically parses PDF content and extracts text and tables

Comprehensive Introduction: The tool automatically analyzes the layout of PDF documents, identifies text, titles, images, tables, formulas and other elements on the page, and determines their correct order. It supports OCR and can convert scanned PDFs to searchable text. It runs on Docker and provides two models...
1yrs ago
059.5K
ChatOllama: Local real-time chat application UI based on Nuxt 3 and Ollama

Comprehensive Introduction: ChatOllama is an open source chat application built on large language models (LLMs), supporting numerous language models and knowledge base management. Users can manage models (list, download, delete), chat with models, and more. The project utilizes ...
2yrs ago
059.5K
LangGraph Supervisor: a tool for managing multi-agent collaboration with a supervisor agent

Comprehensive Introduction: LangGraph Supervisor is a Python library based on the LangGraph framework, designed for creating and managing multi-agent systems. The library coordinates the work of multiple specialized agents through a central supervisor agent, ensuring that communication flows and tasks are divided...
1yrs ago
059.4K
Director: Intelligent Video Agent Framework for Performing Video Search, Editing, and Generation Workflows with Natural Language Descriptions

General Introduction: Director is an open source framework designed to simplify and optimize video interactions and workflows by building intelligent video agents. The framework is based on VideoDB's "video-as-data" infrastructure and is capable of handling complex video tasks such as searching, editing, compiling and generating...
1yrs ago
059.2K
RAG Web UI: Build an Intelligent Document Q&A System and a Private Web-Based Knowledge Base with Ease

Comprehensive Introduction: RAG Web UI is an intelligent dialog system based on RAG (Retrieval Augmented Generation) technology. It helps organizations and individuals build intelligent Q&A systems on top of their own knowledge bases. By combining document retrieval with large language models, RAG Web UI provides accurate and reliable...
1yrs ago
059K