Latest AI Resources

Total: 2830 articles
Step-Audio: A multimodal voice interaction framework that recognizes speech and converses using cloned voices, among other features

Comprehensive Introduction: Step-Audio is an open source intelligent speech interaction framework designed to provide out-of-the-box speech understanding and generation capabilities for production environments. The framework supports multilingual dialogue (e.g., Chinese, English, Japanese), emotional speech (e.g., happy, sad), regional dialects (e.g., Cantonese, Sichuanese ...
9mos ago
041.5K
Mindstream AI Assistant: A deep knowledge search tool and professional research assistant with an integrated knowledge base

Comprehensive Introduction: Mindstream AI Assistant is an intelligent search and knowledge acquisition tool designed to help users efficiently find all kinds of knowledge, from everyday encyclopedic facts to professional academic papers. With Mindstream AI Assistant, users can search content across the whole Internet, quickly find the information they need, and enter an efficient state of flow...
9mos ago
037.3K
Influencer AI: Generate viral UGC ads fast

General Introduction: Influencer AI is a platform that uses artificial intelligence to generate user-generated content (UGC) ads. The platform creates high-converting ads with AI virtual influencers, with no need to shoot footage or sign contracts. Users simply provide a link to a website and AI generates...
9mos ago
029.7K
FoloUp: An open source AI voice interview platform that generates customized interview questions and performs intelligent analysis

General Introduction: FoloUp is an open source platform that specializes in AI-powered voice interview solutions for enterprises. With FoloUp, enterprises can quickly generate customized interview questions from job descriptions and conduct natural conversational interviews with AI. The platform also provides detailed interview analysis...
9mos ago
030.6K
Free online digital human generation tool with support for voice cloning, digital avatar cloning, and video watermark removal

General Introduction: Digital Human Generation System is a website that provides a free digital human generation service. The site supports voice cloning, voice reproduction, digital human image templates, digital avatar cloning, video watermark removal, and other functions, aiming to provide users with efficient and convenient digital human generation. Users can upload...
6mos ago
032K
Confident AI: A framework for automated large language model evaluation that compares the output quality of prompts across different large models

Comprehensive Introduction: DeepEval is an easy-to-use open source LLM evaluation framework for evaluating and testing large language model systems. It is similar to Pytest, but focuses on unit testing of LLM output. DeepEval incorporates the latest research results through G-Eval, hallucination...
9mos ago
031.9K
PraisonAI: A low-code multi-agent framework that simplifies the automation of complex tasks

Comprehensive Introduction: PraisonAI is an out-of-the-box multi-agent framework for production environments, designed to create AI agents that automate and solve problems ranging from simple tasks to complex challenges. The framework provides a low-code solution that simplifies the building of multi-agent LLM systems and...
9mos ago
029.7K
HN Chinese Podcast: Automatically fetches popular tech articles, generates Chinese summaries with AI, and converts them to podcasts

General Introduction: The Hacker News Chinese Podcast project is an innovative AI-based platform that automatically fetches popular articles from Hacker News every day and generates Chinese summaries and podcast content through AI. The project is led by ccbikai ...
10mos ago
033.1K
LangGraph Supervisor: A tool for managing multi-agent collaboration using a supervisor agent

Comprehensive Introduction: LangGraph Supervisor is a Python library based on the LangGraph framework, designed for creating and managing multi-agent systems. The library coordinates the work of multiple specialized agents through a central supervisor agent, ensuring that communication flows and tasks are divided...
10mos ago
034.1K
Deep Research: An AI-based deep research assistant that provides efficient research tools and report generation

General Introduction: Deep Research is an AI-based research assistant designed to perform iterative deep research by combining search engines, web crawling, and large language models. The project was released by dzhng on GitHub with the goal of providing an easy-to-use deep research genera...
8mos ago
030.7K
wdoc: Retrieve content and summarize knowledge from massive, multi-source documents

Comprehensive Introduction: wdoc is a powerful RAG (Retrieval-Augmented Generation) system designed for processing and analyzing large and diverse documents. It can retrieve from a wide range of document types, including PDFs, web pages, YouTube videos, audio files, etc. wdoc is particularly well suited for processing...
10mos ago
032.1K
Magic 1-For-1: An open source project for efficient video generation that claims to generate one minute of video in one minute

Comprehensive Introduction: Magic 1-For-1 is an efficient video generation model designed to optimize memory usage and reduce inference latency. The model decomposes the text-to-video generation task into two subtasks, text-to-image generation and image-to-video generation, enabling more efficient training and distillation...
10mos ago
034.1K
FinRobot: An AI agent platform that improves financial data analysis efficiency and investment research

Comprehensive Introduction: FinRobot is an open source AI agent platform developed by AI4Finance Foundation and designed for financial analytics. It not only covers traditional language models, but also incorporates a variety of AI technologies, aiming to provide a comprehensive solution for the financial industry. F...
10mos ago
041.2K
LocalPdfChatRAG: An intelligent chat tool for question answering over multiple local PDF documents

Comprehensive Introduction: LocalPdfChatRAG is an open source project that aims to implement intelligent chat functionality by combining local PDF documents with Retrieval-Augmented Generation (RAG). The project allows users to upload PDF documents and ask questions in natural language to retrieve from the documents the relevant ...
10mos ago
027.8K
Wenxiaobai (问小白): An all-around AI assistant for work and everyday life, with the full-strength DeepSeek-R1 integrated

Comprehensive Introduction: Wenxiaobai is an AI assistant (available on the web and as an app) developed by Yuanshi Technology. It is based on the company's self-developed Yuanshi large model and currently integrates the latest DeepSeek-R1 model, aiming to simplify users' work and life through quick Q&A, intelligent search, text creation, and other...
6mos ago
042.3K
Goku: Generates finely detailed, consistent videos, suited to creating ad videos that feature detailed people and objects

Comprehensive Introduction: Goku is a joint image and video generation model based on rectified flow Transformers, designed to achieve industry-grade performance. It integrates advanced techniques for high-quality visual generation, including fine-grained data curation, model design, and flow formulation. Goku's main contributions include high-quality fine-grained...
10mos ago
030.6K
Kamili: AI assesses website quality and gives optimization advice

General Introduction: Kamili is a tool that uses artificial intelligence to provide website optimization advice, designed to help users improve the performance, user experience, and SEO of their websites. Through a simple three-step process, users can enter a link to their website, set goals, get a detailed optimization plan, and immediately see...
9mos ago
030.3K
Meetily: An AI assistant for meeting minutes that transcribes and summarizes meetings in real time

General Description: Meetily is an AI-powered meeting assistant developed by Zackriya Solutions that captures meeting audio in real time, performs voice transcription, and generates meeting summaries. It is unique in that all processing is done locally on the device, ensuring user privacy...
10mos ago
070.5K
Immersive Translate plugin: A free multi-language real-time web page translation tool with full support for PDF, EPUB, and video subtitles

Comprehensive Introduction: Immersive Translate is a free and powerful browser plug-in designed to break down language barriers and help you read global information easily. It provides multi-language real-time web page translation, supports translation between dozens of languages, and goes beyond the limits of traditional web page translation by extending to PDF documents, E...
8mos ago
040.8K
Xiaoban WordPress AI Assistant: A WordPress AI assistant plugin for conversation, article generation, and translation

Comprehensive Introduction: WordPress AI Assistant Plugin (wp-ai-chat) is an open source WordPress plugin designed to provide users with a variety of AI features, including AI conversations, article generation, article summarization, article translation, and content reading. The plugin supports integration with multiple ...
10mos ago
032.4K
LiberSonora: Audiobook subtitle extraction and multilingual translation, transcribing audiobooks into multiple languages

General Introduction: LiberSonora, which means "free sound", is a powerful AI-enabled open source audiobook toolset. The toolset supports intelligent subtitle extraction, AI title generation, multi-language translation, etc., and is capable of batch offline processing under GPU acceleration. LiberSo...
10mos ago
029.9K
VideoRAG: A RAG framework for understanding ultra-long videos, with support for multimodal retrieval and knowledge graph construction

Comprehensive Introduction: VideoRAG is a retrieval-augmented generation framework designed for processing and understanding extremely long-context videos. The tool combines a graph-driven textual knowledge base with hierarchical multimodal context encoding to efficiently process on a single NVIDIA RTX 3090 GPU...
10mos ago
034.3K
MedRAX: An agent for chest X-ray analysis using multimodal large models

Comprehensive Introduction: MedRAX is a state-of-the-art AI agent designed for chest X-ray (CXR) analysis. It integrates state-of-the-art CXR analysis tools and multimodal large language models to dynamically process complex medical queries without additional training. MedRAX, through its modular design...
9mos ago
036.5K
zChunk: A general-purpose semantic chunking strategy based on Llama-70B

Comprehensive Introduction: zChunk is a novel chunking strategy developed by ZeroEntropy that aims to provide a general-purpose solution for semantic chunking. The strategy is based on the Llama-70B model and optimizes document chunking by prompting the model to generate chunks, ensuring that information retrieval is maintained at a high...
10mos ago
031.7K
Hibiki: A real-time speech translation model for streaming translation that preserves the characteristics of the original voice

General Introduction: Hibiki is a high-fidelity real-time speech translation model developed by Kyutai Labs. Unlike traditional offline translation, Hibiki can generate natural speech in the target language and provide a text translation in real time while the user is speaking. The model...
10mos ago
035.9K
Pulse: A commercial solution for document processing and data extraction

Comprehensive Introduction: Pulse is an intelligent platform focused on document processing and data extraction, designed to help organizations and developers efficiently parse and process a wide range of complex documents. Using advanced computer vision and multimodal processing technology, Pulse can accurately extract data from text, images, tables, and many other...
10mos ago
031K