AI open source project

1020 posts in total
TryOffAnyone: an AI tool that extracts a garment from a photo of a person as a flat-lay display image

General Introduction: TryOffAnyone is an AI image processing tool built to solve clothing display problems in e-commerce. It converts photos of clothes worn by real people into flat-lay display images. The technology is based on the latest Latent Dif...
9mos ago
023.1K
Agent Laboratory: an automated coding and research report writing assistant for researchers

General Introduction: Agent Laboratory is an end-to-end autonomous research workflow designed to help researchers realize their research ideas. The system consists of dedicated agents driven by large language models that support the entire research workflow, from conducting literature reviews and developing plans to executing...
7mos ago
023.1K
OmAgent: an agent framework for building multimodal smart devices

General Introduction: OmAgent is a multimodal agent framework developed by Om AI Lab that aims to provide AI-powered features for smart devices. By integrating state-of-the-art multimodal foundation models and agent algorithms, the project enables developers to create efficient smart devices on a variety of...
9mos ago
023.1K
Goku: generates detailed, consistent video, suited to creating advertising videos with detailed people and objects

General Introduction: Goku is a joint image-and-video generation model based on rectified flow Transformers, designed to achieve industry-grade performance. It integrates techniques for high-quality visual generation, including fine-grained data curation, model design, and flow formulation. Goku's main contributions include high-quality fine-grained...
8mos ago
023.1K
Mahilo: an integration platform that connects AI agents from different frameworks for real-time collaboration

General Introduction: Mahilo is an open source multi-agent integration platform, released on GitHub by developer Jayesh Sharma, designed to help users connect AI agents from different frameworks and to support real-time communication, human-computer interaction, and intelligent collaboration. The ...
8mos ago
023.1K
Magic 1-For-1: an open source project for efficient video generation that claims to generate one minute of video in one minute

General Introduction: Magic 1-For-1 is an efficient video generation model designed to optimize memory usage and reduce inference latency. The model decomposes the text-to-video generation task into two subtasks, text-to-image generation and image-to-video generation, enabling more efficient training and distillation...
8mos ago
023K
Confident AI: an automated large language model evaluation framework for comparing prompt output quality across different large models

General Introduction: DeepEval is an easy-to-use open source LLM evaluation framework for evaluating and testing large language model systems. It is similar to Pytest, but focuses on unit testing LLM outputs. DeepEval combines the latest research results through G-Eval, hallucination...
8mos ago
022.9K
FoloUp: an open source AI voice interview platform that generates customized interview questions and performs intelligent analysis

General Introduction: FoloUp is an open source platform that specializes in AI-powered voice interview solutions for enterprises. With FoloUp, enterprises can quickly generate customized interview questions from job descriptions and conduct natural conversational interviews with AI. The platform also provides detailed interview analysis...
7mos ago
022.9K
Crawl4LLM: an efficient web crawling tool for LLM pretraining

General Introduction: Crawl4LLM is an open source project jointly developed by Tsinghua University and Carnegie Mellon University that focuses on making web crawling more efficient for large language model (LLM) pre-training. It significantly reduces ineffective crawling by intelligently selecting high-quality web page data, claiming that what originally required crawling 1...
8mos ago
022.9K
PromptWizard: an open source framework for optimizing prompt engineering to improve task performance

General Introduction: PromptWizard is an open source framework developed by Microsoft. It uses a self-evolving mechanism in which the model generates, evaluates, and improves prompts and examples on its own, improving output quality through continuous feedback. It can autonomously optimize prompts, generate and select appropriate examples, and...
10mos ago
022.8K
Mini-Cover: an online cover maker that generates personalized covers for blogs, short videos, social media, and more

General Introduction: Mini-Cover is an open source online cover generation tool designed to generate personalized covers for platforms such as blogs, short videos, and social media. Developed by JLinMr, the tool aims to provide a simple and efficient way for users to quickly generate covers that meet their needs...
10mos ago
022.8K
Unigraph: build a locally running knowledge graph and personal search engine

General Introduction: Unigraph is a local-first, general-purpose knowledge graph and personal search engine designed to give users an integrated workspace for managing and searching the wide variety of data in their personal lives. With Unigraph, users can integrate data from different sources into a...
9mos ago
022.8K
muAgent: a new agent orchestration framework driven by LLM and EKG (industry knowledge)

General Introduction: muAgent is an innovative multi-agent framework developed by Ant Group. Through canvas drag-and-drop and simple text writing, the framework combines multi-agent collaboration, function calls, code interpreters, and other technologies to help users execute various complex standard operating procedures (SOPs) under human guidance...
11mos ago
022.8K
TankWork: an agent that operates computers via voice and text and provides real-time voice feedback

General Introduction: TankWork is an open source desktop agent framework designed to enable AI to perceive and control your computer through computer vision and system-level interaction. The framework allows agents to directly control computers through voice and text commands, process real-time screen content, and provide continuous audio visual...
9mos ago
022.8K
CR-Mentor: a knowledge base + LLM driven intelligent code review mentor for GitHub

General Introduction: CR-Mentor is an intelligent code review tool that combines a specialized knowledge base with the power of a large language model (LLM). It supports code review for all programming languages and also customizes review criteria and focus areas for each team based on best practices accumulated in the knowledge base. Through...
11mos ago
022.8K
HealthGPT: a medical large model that supports medical image analysis and diagnostic Q&A

General Introduction: HealthGPT is a state-of-the-art medical large vision-language model designed to unify medical visual understanding and generation through heterogeneous knowledge adaptation. The goal of the project is to integrate medical visual understanding and generation capabilities into a unified autoregressive framework that significantly improves the medical graph...
8mos ago
022.7K
IMS Toucan: a fast and controllable multilingual text-to-speech tool (7000+ languages supported)

General Introduction: IMS Toucan is a state-of-the-art text-to-speech (TTS) toolkit developed by the Institute for Natural Language Processing (IMS) at the University of Stuttgart, Germany. The toolkit supports more than 7000 languages and is fast, controllable, and light on computational resources. IMS...
8mos ago
022.7K
Marco-o1: an open source version of the OpenAI o1 model, fine-tuned from Qwen2-7B-Instruct, that explores open-ended reasoning models for solving complex problems

General Introduction: Marco-o1 is an open reasoning model developed by Alibaba International Digital Commerce Group (AIDC-AI) to solve complex real-world problems. The model combines Chain of Thought (CoT) fine-tuning, Monte Carlo Tree Search (MCTS), and innovative reasoning strategies...
10mos ago
022.7K
Pyramid Flow: an open source version of "Kling" released by Kuaishou, based on SD3 and running on GPUs with less than 8GB of memory (one-click deployment version)

General Introduction: Pyramid Flow is an efficient autoregressive video generation method based on the Flow Matching technique. The method achieves higher computational efficiency in generating and decompressing video content by interpolating between different resolutions and noise levels...
11mos ago
022.7K
ScrapeGraphAI: an intelligent web content extraction tool that handles web scraping with a single prompt, with no rules to write

General Introduction: ScrapeGraphAI is a Python web scraping library that combines large language models (LLMs) and direct graph logic to create scraping pipelines for websites and local documents. The uniqueness of this tool lies in its level of simplicity and power...
9mos ago
022.7K
OmniParse: extract any unstructured data from documents and multimedia and parse it into structured data

General Introduction: OmniParse is a data parsing and optimization platform designed to convert any unstructured data into structured, actionable data optimized for GenAI (generative artificial intelligence) frameworks. Whether you are working with documents, tables, images, videos, audio files or...
11mos ago
022.7K
Swarm: an experimental teaching project for learning lightweight multi-agent systems (OpenAI example)

General Introduction: Swarm is an experimental educational framework developed by OpenAI to explore lightweight, controllable, and easy-to-test interfaces for multi-agent systems. The framework is primarily used to demonstrate handoffs and routine patterns between agents to help developers understand and implement the coordination and execution of multi-agent systems...
9mos ago
022.7K
Ant Design X: a toolkit for rapidly building AI chat interfaces, with support for model integration and data flow management

General Introduction: Ant Design X is a toolkit open-sourced by Ant Group, designed to help developers quickly build AI-driven chat interfaces. It provides a rich set of components and templates, supports integration with models compatible with OpenAI standards, and is suitable for a variety of applications such as intelligent customer service, AI assistants, and other...
11mos ago
022.6K
GPT Academic: Arxiv academic paper translation, error correction, and code explanation

General Introduction: GPT Academic is a large language model interaction platform optimized for academic research. It provides a practical interaction interface for large language models such as GPT/GLM, specifically optimized for paper translation, paper reading, polishing, and writing. It uses a modular design...
11mos ago
022.5K