Comprehensive Introduction RolmOCR is an open source Optical Character Recognition (OCR) tool developed by the Reducto AI team and based on the Qwen2.5-VL-7B vision-language model. It extracts text from images and PDF files faster than the similar tool olmOCR, with a lower memory footprint. RolmOCR does not obe...
Comprehensive Introduction KrillinAI is an open source video processing tool that uses artificial intelligence to translate videos and dub them automatically. It covers the whole workflow, from downloading the video to generating a finished product adapted to different platforms, in just a few clicks. The developers provide the code for free on GitHub, and users can...
Comprehensive Introduction AiryLark is an open source document processing and translation tool hosted on GitHub, built by the developer wizd on the Next.js framework. It accepts a variety of input file formats (such as PDF, Word, TXT, and Markdown) and provides intelligent translation capabilities. Users can ...
General Introduction Zola is a free and open source AI chat application developed by Julien Thibeaut (GitHub username ibelick) and hosted on GitHub. Its standout feature is support for multiple AI models, such as OpenAI and Mistral, giving users the freedom to choose between different...
Comprehensive Introduction DeepResearcher is an open source project developed by the GAIR-NLP team at Shanghai Jiao Tong University. It is an intelligent research tool based on Large Language Models (LLMs), trained end to end through Reinforcement Learning (RL) in a real web environment. The project aims to help users efficiently complete complex research...
AnimeGamer is an open source tool launched by Tencent ARC Lab. Users can generate anime videos from simple language instructions, such as "Sousuke drives around in a purple car", and can have characters from different anime interact with each other, such as Kiki from Kiki's Delivery Service meeting Pazu from Castle in the Sky. It...
General Introduction Lumina-mGPT-2.0 is an open source project jointly developed by Shanghai AI Laboratory, the Chinese University of Hong Kong (CUHK), and other organizations, hosted on GitHub, and maintained by the Alpha-VLLM team. It is a standalone autoregressive model from scratch...
General Introduction Agent S is an open source framework developed by Simular AI that lets AI agents operate computers the way humans do, through a graphical user interface (GUI). It uses a multimodal large language model and experience-based learning techniques to perform tasks such as browsing the web, editing documents, and using software. The project is on GitHub...
General Introduction BabelDOC is an open source tool designed to translate PDF documents into a bilingual format. It is developed by the funstory-ai team, hosted on GitHub, and aimed mainly at users who work with foreign-language documents, such as researchers, students, and technicians. BabelDOC support will ...
General Introduction Text2Voice is an open source tool that provides text-to-speech functionality based on the SiliconFlow API; its main distinguishing feature is a clean graphical user interface (GUI). It was created by developer Sheldon Lee and published on GitHub so that users can easily turn text into speech through an interface. The project...
General Introduction FreeAI is an open source AI application platform based on the Pollinations.AI API, providing free and unlimited AI chat assistants, image generation and speech synthesis services. Created by developer Azad-sl on GitHub, the project's core feature is the use of pure HTML files to develop...
General Introduction Open WebUI Artifacts Overhaul is a fork of Open WebUI developed by Nick Tonjum. It is an open source tool focused on improving AI code generation and presentation. It allows users to have AI generate code and interface directly...
General Introduction OpenAvatarChat is an open source project developed by the HumanAIGC-Engineering team and hosted on GitHub. It is a modular digital human conversation tool whose full feature set can run on a single PC. The project combines real-time video, speech recognition and digital human technology...
General Introduction uniOCR is an open source text recognition tool developed by the mediar-ai team. It is written in Rust and supports macOS, Windows, and Linux. Users can use it to extract text from images; it is simple to operate and free. uniOCR's core feature is cross-platform support...
General Introduction Serena is a free and open source programming tool developed by the Oraios AI team and hosted on GitHub. It is a powerful code assistant that works directly in your codebase to help developers analyze, edit, and execute code. Serena is implemented through the Language Server Protocol (LSP)...
General Introduction AudioX is an open source project by Zeyue Tian et al. on GitHub, with an accompanying paper published on arXiv (arXiv:2503.10522). It is based on Diffusion Transformer technology and takes text, video, images, audio, and other inputs to generate high-quality ...
General Introduction EasyControl is an open source project based on the Diffusion Transformer (DiT) architecture that provides efficient and flexible image generation control. One of its featured functions is Ghibli Control LoRA, which, using only 100 Asian faces and their GPT-4o-generated Ghibli-style images...
YOLOE is an open source project developed by the Multimedia Intelligence Group (THU-MIG) at the School of Software, Tsinghua University, with the full name "You Only Look Once Eye". Built on the PyTorch framework, it extends the YOLO series and can detect and segment any object in real time. The project is hosted on GitHub, ...
General Introduction Open-VoiceCanvas is an open source speech synthesis platform developed by the ItusiAI team. It supports more than 50 languages, can turn text into natural speech, and can also clone personalized voices by uploading audio. The project integrates OpenAI TTS, AWS Polly and MiniMax three...