Latest AI Resources

Total: 2618 articles
WriteWise:喜马拉雅推出的专业AI小说写作工具

WriteWise: a professional AI novel-writing tool from Ximalaya

Comprehensive Introduction WriteWise is an online novel-writing platform launched by Ximalaya. It provides professional AI writing assistance covering character design, dialogue writing and martial-arts fight scenes. A downloadable desktop version is also available, with rich editor formatting options as well as stable...
11mos ago
03.3K
Waifu2x Extension GUI:深度学习技术放大、修复图像与视频插帧(Windows x64)

Waifu2x Extension GUI: Deep Learning Upscaling and Restoration for Images, Plus Video Frame Interpolation (Windows x64)

Comprehensive Introduction Waifu2x-Extension-GUI is a powerful image and video processing tool that uses deep convolutional neural networks to perform super-resolution upscaling and frame interpolation on images, GIFs and videos. The tool supports multiple algorithms and engines, including Wai...
8mos ago
03.3K
ChatTTS:模仿真人说话声音的语音生成模型(ChatTTS一键加速包)

ChatTTS: a speech generation model that mimics the voice of a real person speaking (ChatTTS one-click acceleration package)

General Introduction ChatTTS is a generative speech model designed for conversational scenarios. It generates natural and expressive speech, supports multiple languages and multiple speakers, and is suitable for interactive conversations. The model does this by predicting and controlling fine-grained prosodic features such as laughter, pauses and interjections, sup...
6mos ago
03.2K
Ragas:评估RAG召回QA准确率与答案相关性

Ragas: evaluating RAG retrieval, QA accuracy and answer relevance

Comprehensive Introduction Ragas is a tool specifically designed to evaluate and optimize Retrieval Augmented Generation (RAG) systems. It provides a comprehensive set of evaluation metrics by analyzing the relationships between queries, retrieved contexts, and generated answers. These metrics include faithfulness, answer relevance, context relevance, on...
7mos ago
03.2K
Akool:生成图像和视频营销素材|视频换脸|视频翻译|人像说话

Akool: Generate image and video marketing materials | Video Face Swap | Video Translation | Talking Portraits

General Introduction Akool is a platform focused on personalized visual marketing and advertising. Using advanced AI technology, Akool helps users easily create high-quality, personalized video content for fields such as advertising, online education, art creation and e-commerce. It provides face swapping...
9mos ago
03.2K
VideoRAG:理解超长视频的RAG框架,支持多模态检索和知识图谱构建

VideoRAG: A RAG framework for understanding ultra-long videos with support for multimodal retrieval and knowledge graph construction

Comprehensive Introduction VideoRAG is a retrieval-augmented generation framework designed for processing and understanding extremely long-context videos. The tool combines a graph-driven textual knowledge base with hierarchical multimodal context encoding to efficiently process on a single NVIDIA RTX 3090 GPU...
6mos ago
03.2K
AI ContentCraft:生成短故事、对话脚本、配音、配图的多功能AI内容创作工具

AI ContentCraft: a versatile AI content creation tool for generating short stories, dialog scripts, voiceovers, and graphics

General Introduction AI ContentCraft is a versatile content creation tool that integrates text generation, speech synthesis, image generation and more. It helps creators quickly generate stories, podcast scripts, and accompanying audio and video content. The tool supports multiple language conversions and can batch...
7mos ago
03.2K
AI2SRT:利用 Gemini模型,一键为长视频创建解说短视频或视频总结

AI2SRT: Create short narrated videos or video summaries for long videos with one click using Gemini models

Comprehensive Introduction AI2SRT is an open source project that uses the Gemini large model to generate short narrated videos and video summaries from long videos with one click, and also supports transcribing audio and video into subtitles. The project aims to simplify the video content creation process and provide efficient subtitle generation and translation functions. Users can pass...
8mos ago
03.2K
飞书知识问答:使用飞书文档作为AI知识库

Feishu Knowledge Q&A: Using Feishu Documents as an AI Knowledge Base

Comprehensive Introduction Feishu Knowledge Q&A is an AI-driven knowledge management and Q&A tool launched by Feishu, which deeply integrates the DeepSeek R1 large model. It supports real-time web search and multi-format file parsing (including documents, images, etc.), and can connect seamlessly to the enterprise knowledge base to help use...
5mos ago
03.2K
VideoLingo:视频转录单词级时间轴字幕,视频字幕翻译和本地化配音开源工具

VideoLingo: an open source tool for word-level timestamped subtitle transcription, video subtitle translation and localized dubbing

General Description VideoLingo is a one-stop video translation and localization dubbing tool designed to generate Netflix-grade, high-quality subtitles, eliminating raw machine translation and multi-line subtitles, and adding high-quality voiceovers that enable global knowledge to be shared across language barriers. By...
10mos ago
03.2K
Resemble AI:人工智能语音合成平台|声音克隆|深度伪造音频检测

Resemble AI: Artificial Intelligence Speech Synthesis Platform | Voice Cloning | Deepfake Audio Detection

Comprehensive Introduction Resemble AI is an artificial intelligence speech synthesis platform designed for the enterprise. The platform provides cutting-edge AI voice generation technology and deepfake audio detection for future information security. Features include voice cloning, real-time deepfake audio detection, AI watermarking technology...
10mos ago
03.2K
xyks:小猿口算逆向笔记,逆向工程与解密算法

xyks: Xiaoyuan Kousuan reverse-engineering notes, covering reverse engineering and decryption algorithms

Comprehensive Introduction Xiaoyuan Kousuan Reverse-Engineering Notes is an open source project that aims to document and share the process and methods of reverse engineering the Xiaoyuan Kousuan application. The project contains usage instructions for a variety of reverse-engineering tools and techniques, such as Frida, dexdump, etc., to help users understand and crack the Xiaoyuan Kousuan add...
10mos ago
03.2K
MatAnyone: 提取视频指定目标人像的开源工具,生成目标人像视频

MatAnyone: an open source tool that extracts a specified target portrait from a video and generates a video of that portrait

General Introduction MatAnyone is an open source project focusing on video matting, developed and released on GitHub by a research team at S-Lab, Nanyang Technological University, Singapore. It provides users with stable and efficient video processing capabilities through consistent memory propagation techniques, especially...
6mos ago
03.2K
Kheish:多角色智能体,审查、验证和格式化输出以生成高质量结果

Kheish: a multi-role agent that reviews, validates and formats output to produce high-quality results

Comprehensive Introduction Kheish is an open source multi-role agent designed for Large Language Model (LLM) tasks that require structured, step-by-step collaboration. Kheish is more than a simple coordinator: it is an intelligent agent in its own right, requesting modules on demand, integrating user-reversal...
7mos ago
03.2K
Ultravox:实时端到端语音对话的音频多模态大模型,GPT-4o语音交互的开源实现

Ultravox: an audio multimodal large model for real-time end-to-end voice dialogue, an open source implementation of GPT-4o voice interaction

Comprehensive Introduction Ultravox is an innovative multimodal Large Language Model (LLM) designed for real-time speech processing. Unlike traditional speech recognition systems, Ultravox eliminates the need for a separate Automatic Speech Recognition (ASR) stage, and is able to directly convert audio into high-dimensional space in...
8mos ago
03.2K
TransRouter:基于Gemini多模态模型,实时中英互译的音频转换工具

TransRouter: A Real-Time Audio Conversion Tool for Chinese-English Translation Based on the Gemini Multimodal Model

TransRouter is a real-time voice translation tool based on Google's Gemini model, designed specifically for translating between English and Chinese. The tool can be seamlessly integrated into video conferencing software such as Zoom, providing an easy way for cross-language...
7mos ago
03.2K
MimicPC:在线AI生成器,提供多种预安装AI应用,海外版端脑云

MimicPC: an online AI generator offering a wide range of pre-installed AI applications, an overseas counterpart to Duannao Cloud (端脑云)

General Introduction MimicPC is an online AI generator platform that provides a wide range of pre-installed AI applications that users can use without complicated installation steps. The platform supports image generation, facial fusion, e-commerce modeling and many other features for users with different needs. MimicPC...
9mos ago
03.2K
SP-MangaEditer:专业四格漫画插图创作工具,生成图像、编辑漫画页面

SP-MangaEditer: Professional four-panel manga illustration creation tool, generating images, editing manga pages

General Introduction SP-MangaEditer is an independent manga editing platform designed for manga creators. The platform supports image generation, layer editing, image adjustment, filter application and many other functions to help users easily create high-quality manga illustrations. Users can simply manipulate...
7mos ago
03.2K
问小白:提供工作和生活帮助的全能AI助手,集成满血DeepSeek-R1

Wenxiaobai: an all-round AI assistant for work and life, with the full-strength DeepSeek-R1 integrated

Comprehensive Introduction Wenxiaobai is an AI assistant (available on the web and as an app) developed by Yuanshi Technology. Built on the company's self-developed Yuanshi large model, it currently integrates the latest DeepSeek-R1 model, aiming to simplify users' work and life through quick Q&A, intelligent search, text creation, and other...
3mos ago
03.2K
音剪:喜马拉雅自然人声、多人旁白音频创作平台

Yinjian: Ximalaya's audio creation platform with natural-sounding voices and multi-speaker narration

Comprehensive Introduction Yinjian, Ximalaya's audio editor, is a comprehensive AI audio creation platform. It supports professional-grade podcast production, multi-track recording, audio editing, and text-to-speech conversion. The platform also offers a range of professional voice options, helping users...
1yr ago
03.2K
MakeSense:免费使用的图像标注工具,提升计算机视觉项目效率

MakeSense: a free-to-use image annotation tool to improve computer vision project efficiency

General Introduction Make Sense is a free online image annotation tool designed to help users quickly prepare datasets for computer vision projects. It requires no complicated installation: just open it in a browser. It supports multiple operating systems and is well suited to small deep learning projects. Users can...
6mos ago
03.2K
BotSharp:基于.NET的多智能体AI应开发与管理平台

BotSharp: a .NET-based platform for developing and managing multi-agent AI applications

Comprehensive Introduction BotSharp is an open source project based on .NET Core that provides a comprehensive toolkit for building AI chatbot platforms. It is written in C#, runs cross-platform, and aims to simplify the application of machine learning algorithms, enabling enterprise-level developers to efficiently ...
7mos ago
03.2K
GPTme:在命令行终端中运行的智能编程助手,ChatGPT代码解释器的本地化替代方案

GPTme: Intelligent Programming Assistant Running in a Command Line Terminal, a Local Alternative to ChatGPT Code Interpreter

Comprehensive Introduction GPTme is a terminal AI assistant tool designed to improve developers' productivity. It combines powerful AI capabilities with the terminal environment, supporting functions such as code execution, file editing, web browsing and visual recognition. As ChatGPT code solving...
8mos ago
03.2K