General Introduction Rabbit Android Agent is an AI agent developed by Rabbit, designed to help users complete single or multi-step tasks on their Android devices through voice and text commands. The technology is based on Rabbit's LAM (Large Action Model)...
General Introduction Convergence is a company dedicated to helping people regain control of their time using machine learning technologies. By developing large-scale meta-learning models (LMLMs), Convergence's AI agents (browser agents) are able to acquire new skills, take action, and continuously improve in real-time use. Its core ...
General Introduction mac assistant is an AI agent project designed specifically for macOS, aiming to simplify user operations by combining native software and web features. The project currently supports the OpenAI and Gemini APIs, and plans to support local large language models run through Ollama in the future. mac_assista...
General Introduction Open Operator is an open source project that aims to automate operations in the browser through AI agents. Developed by Browserbase, the project combines the technologies of Stagehand and Browserbase to enable users to control the behavior of the browser through natural language commands. Ope...
General Introduction MobileAgent is a mobile device operation assistant designed to improve the efficiency and automation of mobile device operation through multi-agent collaboration and enhanced visual perception modules. Developed by the X-PLUG team, it supports Android and HarmonyOS, and is capable of working in complex...
General Introduction TankWork is an open source desktop agent framework designed to enable AI to perceive and control your computer through computer vision and system-level interaction. The framework allows agents to directly control computers through voice and text commands, process real-time screen content, and provide continuous audio visual feedback and manipulation...
General Introduction UI-TARS Desktop is a graphical interface agent application based on UI-TARS, a vision-language model developed by ByteDance. The application allows users to control computers through natural language for more intuitive and efficient human-computer interaction. UI-TARS Desktop supports cross-platform operation, both...
General Introduction Shortest is an AI-powered natural language end-to-end testing framework developed by the Anti-Work team. It is built on Playwright and supports GitHub integration and two-factor authentication (2FA). Shortest's main feature is to write test cases through natural language and utilize Anthropic Cl...
General Introduction Midscene.js is an AI-powered browser automation tool that controls web pages, performs assertions and extracts data through natural language commands. It supports Chrome extensions, JavaScript SDKs and YAML scripts, simplifying the process of writing and maintaining UI tests. By utilizing multimodal large ...
General Introduction Stagehand is an AI web browsing framework focused on simplicity and extensibility. It is fully Playwright-compatible and provides three simple AI APIs (act, extract, and observe) that are built on top of the underlying Playwright Page classes for web through natural language...
General Introduction Eko is a production-grade JavaScript framework for building efficient agent workflows from natural language descriptions. It is designed to enable developers to automate everyday tasks using AI technologies without deep programming expertise. Eko provides a unified interface that supports the use of AI in counting...
General Description AutoMouser is a Chrome extension that intelligently tracks user interactions and automatically generates Selenium test code using OpenAI's GPT model. It does this by recording user browser actions and converting them into robust, maintainable Python Selenium scripts,...
Comprehensive Introduction Browser Use Web UI is an open source project that gives AI agents a graphical interface for interacting with the browser. The project is built on top of the browser-use core framework and uses Gradio to provide a user-friendly web interface, making it easy for AI agents to ...
General Introduction E2B Open Computer Use is an open source project that aims to provide a secure cloud-based Linux computer use experience through the E2B Desktop Sandbox. The E2B Sandbox provides a desktop graphical environment that users can connect to any Large Language Model (LLM) to control their computers, supporting...
General Introduction NeoAI is an open source AI assistant tool that allows users to control and manage their computers through natural language conversations. Without writing any code, users can find files, automate tasks, manage devices, and more through everyday conversation. NeoAI supports Window...
Comprehensive Introduction CogAgent is an open source visual language model developed by Tsinghua University Data Mining Research Group (THUDM), aiming to automate cross-platform graphical user interface (GUI) operations. The model is based on CogVLM (GLM-4V-9B), supports bilingual interactions in English and Chinese, and is able to automate GUI operations through screenshots and natural...
General Introduction ClickClickClick is a framework developed by BandarLabs that aims to automate Android and PC operations by using any local or remote Large Language Model (LLM). The project is currently in a highly experimental phase and supports a variety of models such as Ollama, Gemini and GPT-4o. Using...
Comprehensive Introduction Browser-Use is an open source web automation tool designed to enable Large Language Models (LLMs) to interact naturally with websites. It provides a flexible framework that supports a wide range of mainstream language models, including GPT-4, Claude, and others. The tool's most notable feature...
General Introduction Project Mariner is a research prototype launched by Google DeepMind to explore the future of human-computer interaction. The project leverages the powerful multimodal understanding and reasoning capabilities of Gemini 2.0 to accomplish a variety of tasks through browser automation. Project Mariner is able to reason...
Chief AI Sharing Circle specializes in AI learning, providing comprehensive AI learning content, AI tools and hands-on guidance. Our goal is to help users master AI technology and explore the unlimited potential of AI together through high-quality content and practical experience sharing. Whether you are an AI beginner or a senior expert, this is the ideal place for you to gain knowledge, improve your skills and realize innovation.