General Introduction UI-TARS Desktop is a GUI agent application based on UI-TARS, a vision-language model developed by ByteDance. The application lets users control computers through natural language for more intuitive and efficient human-computer interaction. UI-TARS Desktop supports cross-platform operation, both...
General Introduction Devin Cursor Rules is an open source project that aims to enhance the Cursor and Windsurf integrated development environments (IDEs) with configuration files and tools to enable advanced AI capabilities similar to Devin. The project provides process planning, self-evolution, extended tool usage (e.g., web browsing...
Enable Builder smart programming mode for unlimited use of DeepSeek-R1 and DeepSeek-V3, with a smoother experience than the overseas version. Enter commands in Chinese, and even novice programmers can build their own apps with no barrier to entry.
General Introduction Repomix (formerly known as Repopack) is an open source tool designed to package an entire codebase into a single, AI-friendly file. This tool makes it easy for developers to make their codebase available to large language models (such as Claude, ChatGPT, and Gemini) for analysis and processing...
General Introduction Yek is a fast Rust-based tool for reading text files from repositories or directories, chunking them, and serializing them for use with Large Language Models (LLMs). By default, the tool follows .gitignore rules to skip unwanted files and uses Git history to infer which files are important...
Comprehensive Introduction Kheish is an open source multi-role agent designed for Large Language Model (LLM) tasks that require structured, step-by-step collaboration. Kheish is more than a simple coordinator: it is an intelligent agent in its own right, requesting modules on demand, integrating user feedback across different...
General Introduction AI ContentCraft is a versatile content creation tool that integrates text generation, speech synthesis, image generation and more. It helps creators quickly generate stories, podcast scripts, and accompanying audio and video content. The tool supports conversion between multiple languages, can batch process content, and is extremely...
General Introduction Unigraph is a local-first universal knowledge graph and personal search engine designed to provide users with an integrated workspace to help manage and search for a wide variety of data in their personal lives. With Unigraph, users can integrate data from different sources into a unified knowledge graph...
General Introduction ComfyUI-disty-Flow is a custom node that provides a user-friendly interface for ComfyUI. It is intended to simplify running workflows by providing alternative user interfaces, rather than to replace workflow creation. ComfyUI-disty-Flow is currently in the early stages of development, so...
General Introduction Shortest is an AI-powered natural language end-to-end testing framework developed by the Anti-Work team. It is built on Playwright and supports GitHub integration and two-factor authentication (2FA). Shortest's main feature is writing test cases in natural language and utilizing Anthropic Cl...
General Introduction Midscene.js is an AI-powered browser automation tool that controls web pages, performs assertions and extracts data through natural language commands. It supports Chrome extensions, JavaScript SDKs and YAML scripts, simplifying the process of writing and maintaining UI tests. By utilizing multimodal large ...
Comprehensive Introduction Video Analyzer is a comprehensive video analysis tool that combines computer vision, audio transcription, and natural language processing techniques to generate detailed video content descriptions. The tool does this by extracting key frames from the video, transcribing audio content, and generating natural language...
General Introduction Unsloth is an open source project designed to provide efficient tools for fine-tuning and training large language models (LLMs). The project supports a wide range of well-known models, including Llama, Mistral, Phi, and Gemma. Unsloth's main features are the ability to significantly reduce memory usage and speed up training...
Comprehensive Introduction MaxKB (Max Knowledge Base) is an open source knowledge base Q&A system based on large language models and RAG (Retrieval Augmented Generation). The system is widely used in intelligent customer service, enterprise internal knowledge bases, academic research, education, and other scenarios. MaxKB supports uploading documents directly or automatically crawling in...
Comprehensive Introduction OmniThink is a machine writing framework designed to generate high-quality, long-form articles by mimicking the iterative expansion and reflection of human cognitive processes. The framework focuses on extending the boundaries of knowledge and generating information that is rich and deep. OmniThink generates articles by building outlines and...
General Introduction OpenAI Realtime Agents is an open source project that aims to show how OpenAI's Realtime API can be used to build multi-agent voice applications. It provides a high-level agent model (borrowed from OpenAI Swarm) that allows developers to build complex multi-agent voice systems in a short time...
General Introduction DeepFace is a lightweight Python library for facial recognition and facial attribute analysis (including age, gender, emotion and ethnicity). It integrates several advanced facial recognition models such as VGG-Face, FaceNet, OpenFace, DeepFace, DeepID, ArcFace, Dlib, SFace...
Comprehensive Introduction SynthLight is a portrait relighting tool based on a diffusion model. It learns to re-render synthetic face images in order to adjust the lighting of real portrait photos. The tool uses a physically based rendering engine to generate datasets that simulate lighting transformations under different lighting conditions. SynthLigh...
General Introduction 1-2-1-MNVTON is a GitHub-based open source project that aims to achieve efficient virtual try-on through the "Modality-specific Normalization for Virtual Try-On" (MNVTON) technology. The project solves the problem of high computational cost in traditional virtual try-on techniques by providing ...
General Introduction Kokoro-ONNX is an open source text-to-speech (TTS) tool based on ONNX Runtime. Developed by thewh1teagle, the project aims to provide efficient and fast speech synthesis. Kokoro-ONNX supports multiple languages, including English, and plans to support French, Japanese, Korean...