General Introduction Text2Video-Zero is the official implementation, hosted on GitHub, of a zero-shot text-to-video generator developed by the Picsart AI Research team. The project provides a new way to generate temporally consistent videos that closely follow their text prompts. The team has also released...
Comprehensive Introduction Retrieval-based Voice Conversion WebUI is a simple, easy-to-use voice conversion framework based on VITS. It can convert between arbitrary speakers, with uses including song covers and real-time voice changing. It features low latency, high-quality voice conversion, training with a small amount of data...
Comprehensive Introduction VoiceCraft is an open source tool for speech editing and zero-shot speech synthesis, based on a neural codec language model. It uses an innovative encoded sequence generation method that supports insertion, deletion, and replacement operations on existing speech sequences, producing natural and coherent edited speech. At the same time, ...
General Introduction edge-tts is an open source Python module that lets users call Microsoft Edge's online text-to-speech service from Python code without needing the Microsoft Edge browser, the Windows operating system, or an API key. It can be used directly from the command line via edge-tts and edge-...
General Introduction CoAI.Dev (formerly Chat Nio) is a chat platform that integrates multiple AI models and supports distributed streaming, image generation, and cross-device conversation synchronization and sharing. It implements subscription and token-based billing, an API key relay service, and multi-model support, and also includes web-connected search and AI...
Comprehensive Introduction ChatOllama is an open source online chat application built on large language models (LLMs). It supports numerous language models as well as knowledge base management. Users can manage models (list, download, delete), chat with them, and more. The project uses the Nuxt 3 framework...
Comprehensive Introduction MinerU is an open source data extraction tool developed by the OpenDataLab team at the Shanghai Artificial Intelligence Laboratory. It focuses on efficiently extracting content from complex PDF documents, web pages, and eBooks. It can convert multimodal PDF documents containing images, formulas, tables, and other elements into easy-to-analyze M...
Comprehensive Introduction DCT-Net is an open source project developed by DAMO Academy and the Wangxuan Institute of Computer Technology, Peking University, for converting images into an anime style. The project applies deep learning through Domain-Calibrated Translation (DCT) to...
General Introduction Diffusers Image Outpaint is an AI image expansion tool created by Hugging Face community member fffiloni. The tool uses diffusion models to seamlessly extend an image beyond its original edges (outpainting) and produce a high-quality image...
Comprehensive Introduction Tap4 AI WebUI is an open source, lightweight AI tool directory website project designed to help users easily build their own AI tool catalog. The project uses a Next.js and Supabase technology stack, supports multilingual SEO, and provides category filtering, search, and detail pages for AI tools...
CodeFormer General Introduction CodeFormer is a codebase for robust blind face restoration, developed by a team of researchers at S-Lab, Nanyang Technological University, and presented at NeurIPS 2022. The project uses the Codebook Lookup Transformer technique, which aims to improve...
Comprehensive Introduction GFPGAN (Generative Facial Prior GAN) is an open source face restoration algorithm developed by Tencent ARC (Applied Research Center). The algorithm uses the rich and diverse priors encapsulated in pre-trained face GANs (e.g., StyleGAN2) for blind face restoration. G...
General Introduction Curiosity is a project aimed at exploration and experimentation, primarily using the LangGraph and FastHTML technology stacks, with the goal of building a Perplexity AI-like search product. At the heart of the project is a simple ReAct Agent that utilizes Tavily search to enhance text generation...
Comprehensive Introduction Moshi Chat is an end-to-end real-time AI voice assistant from Kyutai, a French non-profit AI lab. It not only listens in real time but also engages in natural conversation and supports multimodal interaction, including the ability to see, hear, and speak. Moshi Chat understands the user's intonation and can be used in...
QAnything General Introduction QAnything (Question and Answer based on Anything) is a local knowledge base Q&A system launched by NetEase. It supports a wide range of file formats and databases and can be installed and used offline. It can handle documents in PDF, Word, PPT, XLS, and other formats, and supports cross...
General Introduction stickerbaker is an open source sticker maker that uses artificial intelligence to create a variety of fun stickers. Whether you want a simple cat sticker or a diverse set of stickers, stickerbaker has you covered. Simply describe the sticker you want...
General Introduction ALog is an AI-based voice diary application designed to help users record their daily lives by voice. The project is developed by duxins and open-sourced on GitHub. Users can record their diary through voice input, and the app will automatically convert the voice to text and analyze it intelligently...
Comprehensive Introduction OpenSPG is an open source knowledge graph engine developed by Ant Group in collaboration with OpenKG, based on the SPG (Semantic-enhanced Programmable Graph) framework. The engine is designed to provide features such as explicit semantic representation, logical rule definition, and an operational framework to support the construction and management of domain knowledge graphs. OpenSPG combines...
General Introduction Mem0 (pronounced "mem-zero") is an open source project that provides an intelligent memory layer for AI assistants and agents. It remembers user preferences, adapts to individual needs, and improves over time, making it well suited to customer support chatbots, AI assistants, and autonomous systems. Me...