AI open source projects

1020 posts in total
MagicArticulate: turning static 3D models into skeletal animation assets

Comprehensive Introduction MagicArticulate is an AI framework developed by ByteDance in collaboration with Nanyang Technological University, focusing on rapidly transforming static 3D models into animation-enabled digital assets. It does this through an advanced autoregressive Transformer and functional diffusion modeling, self...
8mos ago
023.7K
MindSearch: an open source AI search engine framework for deploying your own Perplexity search engine

Comprehensive Introduction MindSearch is an open source AI search engine framework launched by Shanghai Artificial Intelligence Laboratory (SAL), aiming to simulate the human thought process for complex information gathering and integration. The tool combines large language models (LLMs) with search engine technology through multi-agent...
10mos ago
023.7K
An open source service that automatically parses PDF content and extracts text and tables

Comprehensive Introduction This service automatically analyzes the layout of PDF documents, identifies text, titles, images, tables, formulas and other elements on each page, and determines their correct reading order. The tool supports OCR and can convert scanned PDFs to searchable text. It runs on Docker and provides two models...
6mos ago
023.7K
Orchestra: Building Smart AI Teams for Easier and More Efficient Multi-Agent Collaborative Development

Comprehensive Introduction Orchestra is a lightweight Python framework that focuses on building multi-agent collaborative systems based on Large Language Models (LLMs). It orchestrates agents so that multiple AI agents can work together harmoniously, like a symphony orchestra. By modeling ...
9mos ago
023.7K
Kheish: a multi-role agent that reviews, validates, and formats output to produce high-quality results

Comprehensive Introduction Kheish is an open source multi-role agent designed for Large Language Model (LLM) tasks that require structured, step-by-step collaboration. Kheish is more than a simple coordinator: it is an intelligent agent in its own right, requesting modules on demand, integrating user-reversal...
9mos ago
023.6K
AI Podcast Generator: automatically fetching news to generate audio podcasts

General Introduction AI Podcast Generator is an intelligent podcast generation tool that utilizes advanced AI technology to automatically create engaging audio content from web sources. The system generates natural flowing narratives by capturing news content and converting it into audio podcasts. The project is based on Next...
11mos ago
023.6K
SciToolAgent: an agent that integrates 500+ research tools to automate scientific research tasks

Comprehensive Introduction SciToolAgent is an open source tool platform developed by the Innovation Center of Zhejiang University in Hangzhou (HICAI-ZJU). It integrates more than 500 scientific tools through a knowledge graph (SciToolKG) and large language model technologies to help researchers deal with...
7mos ago
023.6K
InvSR: an open source image super-resolution project for improving image resolution and quality

General Introduction InvSR is an open source image super-resolution project based on diffusion inversion, capable of converting low-resolution images into high-quality, high-resolution images. The project utilizes the rich image priors embedded in pre-trained large-scale diffusion models to support, through a flexible sampling mechanism, the...
10mos ago
023.6K
VideoChat: a real-time voice-interactive digital human with customizable appearance and voice cloning, supporting end-to-end and cascaded voice solutions

Comprehensive Introduction VideoChat is a real-time voice-interactive digital human project based on open source technology, supporting both end-to-end voice schemes (GLM-4-Voice - THG) and cascaded schemes (ASR-LLM-TTS-THG). The project allows users to customize the digital ...
11mos ago
023.6K
Director: an intelligent video agent framework for running video search, editing, and generation workflows from natural language descriptions

General Introduction Director is an open source framework designed to simplify and optimize video interactions and workflows by building intelligent video agents. The framework is based on VideoDB's "video-as-data" infrastructure and is capable of handling complex video tasks such as searching, editing, compiling and generating...
10mos ago
023.5K
DCT-Net: an open source tool for converting photos and videos to anime style

Comprehensive Introduction DCT-Net is an open source project developed by DAMO Academy and Wang Xuan Institute of Computer Technology, Peking University, for converting images to an anime style. The project utilizes deep learning techniques through Domain-Calibrated Translation (Domain-Calibrat...
9mos ago
023.5K
JoyGen: an audio-driven, 3D depth-aware talking-portrait video editing tool

Comprehensive Introduction JoyGen is an innovative two-stage video generation framework for talking faces, focusing on solving the problem of audio-driven facial expression generation. Developed by a team from Jingdong Technology, the project uses advanced 3D reconstruction techniques and audio feature extraction methods to accurately capture the identity characteristics of the speaker and the expression...
9mos ago
023.4K
HivisionIDPhotos: an open source AI tool for creating ID photos

Comprehensive Introduction HivisionIDPhotos is an open source, lightweight AI tool for producing ID photos. It intelligently recognizes the scene in a user's photo and cuts out the subject to generate standard ID photos that meet a variety of specifications. The tool supports custom background colors and sizes, and in the future will also introduce beauty and...
1yr ago
023.4K
ChainForge: an open source visual programming environment for testing and evaluating large language model prompts

Comprehensive Introduction ChainForge is an open source visual programming environment designed for testing and evaluating the effectiveness of Large Language Model (LLM) prompts. It provides a dataflow-based prompt engineering environment through which users can quickly explore and analyze the effect of different prompts on the quality of LLM response...
10mos ago
023.3K
Bambo: a lightweight, flexible agent framework with simple configuration of roles and tools for handling a variety of workloads

Comprehensive Introduction Bambo is a new agent framework that is lighter and more flexible than mainstream frameworks and can handle a variety of workloads. Bambo implements its agent functionality by defining all the tools in a tool catalog and using asynchronous custom functions. Users can use the llm_c...
10mos ago
023.3K
MOFA-Video: Motion Field Adaptation Technology Converts Still Images to Video

General Introduction MOFA-Video is a state-of-the-art image animation generation tool that utilizes generative motion field adaptation techniques to convert static images into dynamic videos. The project was developed in collaboration with the University of Tokyo and Tencent AI Lab, and will be presented at the 2024 European Conference on Computer Vision (E...
9mos ago
023.2K
Sketch-Gen: generate high-quality line art and sketches, reverse-engineer prompts from images, one-click installer package

General Introduction Sketch-Gen is an AI technology-based line drawing and sketch generation tool designed to help artists and designers quickly generate high-quality line drawings and sketches. The tool is derived from the Paints-UNDO project and utilizes advanced machine learning models that can...
10mos ago
023.2K
LLManager: a management tool that combines intelligent, automated approval workflows with human review

Comprehensive Introduction LLManager is an open source intelligent approval management tool, built on LangChain's LangGraph framework, focused on automating the processing of approval requests while optimizing decision making with human review. It does this through semantic search, few-shot learning and...
6mos ago
023.2K
Research Rabbit: web research and report writing with a local LLM, automatically drilling down into user-specified topics and generating summaries

General Introduction Research Rabbit is a web research and summarization assistant based on a local LLM (Large Language Model). After the user provides a research topic, Research Rabbit generates a search query, obtains relevant web results, and summarizes those results...
7mos ago
023.2K
DeepRant: An Open Source Client for Real-Time Translation of Game Chat Content

General Introduction DeepRant is an open source translation tool for gamers, designed to overcome language barriers on international servers. It provides instant translation of in-game text via shortcut keys, supports translation between multiple languages, and allows players to quickly understand and reply to chat messages without exiting the game...
7mos ago
023.1K
Story-Adapter: generating coherent, stylistically consistent illustrations from long stories

General Introduction Story-Adapter is an innovative story visualization framework that converts textual stories into coherent image sequences. Developed by researchers, this project employs an iterative approach that requires no training to generate high-quality story illustrations. The framework is characterized by its ability to handle long...
9mos ago
023.1K