
Nanobrowser: Multi-Agent Plugin for Task Automation in Browsers

General Introduction

Nanobrowser is an open source Chrome extension that automates web tasks through an AI-powered multi-agent system. It is a free alternative to OpenAI Operator: users only need to supply their own LLM (Large Language Model) API key. OpenAI and Anthropic models are currently supported, with more options to be added in the future. All operations run in the local browser, with no cloud data sharing involved, which helps protect privacy and security. Three agents, Planner, Navigator, and Validator, collaborate to handle tasks ranging from simple searches to complex multi-step processes. The project code is hosted on GitHub, and an active community takes part in discussions and contributes via Discord or X.


Feature List

  • Multi-agent system: Planner develops the strategy, Navigator performs the operations, and Validator verifies the results; together they accomplish complex tasks.
  • Flexible LLM support: OpenAI and Anthropic models are supported, and a different model can be chosen for each agent.
  • Local operation: Data processing is done locally to protect user privacy.
  • Task automation: Performs web searches, form filling, data extraction, and other operations.
  • Interactive sidebar: Provides a chat interface with real-time status updates.
  • Conversation history: Keeps records of tasks for later viewing and management.
  • Open source and transparent: The code is open for review and improvement.
  • Follow-up questions: Supports contextual follow-up questions based on task results.

 

Usage Guide

Installation process

Nanobrowser is available as a Chrome extension that offers two installation options: downloading a pre-built version directly or building from source.

Method 1: Install the pre-built version

  1. Download the extension
    • Visit https://github.com/nanobrowser/nanobrowser/releases.
    • Find the latest version (e.g. v1.0.0) on the Releases page.
    • Download the file named "nanobrowser.zip".
  2. Unzip the file
    • Extract "nanobrowser.zip" to a local folder (e.g. a "nanobrowser" folder).
  3. Load into Chrome
    • Open Chrome and go to chrome://extensions/.
    • Enable "Developer mode" in the upper right corner.
    • Click "Load unpacked" in the upper left corner.
    • Select the unzipped "nanobrowser" folder and click "Select Folder".
    • After a successful installation, the Nanobrowser icon appears in the Chrome toolbar.
  4. Configure the API key
    • Click the Nanobrowser icon in the toolbar to open the sidebar.
    • Click the Settings icon in the upper right corner.
    • Enter your LLM API key (available from the OpenAI or Anthropic websites).
    • Select a model for each of Planner, Navigator, and Validator (e.g. OpenAI's GPT-4o or Anthropic's Claude).
    • Save the settings to complete the configuration.

Method 2: Build from source

  1. Prepare the environment
    • Install Node.js (v22.12.0 or later).
    • Install pnpm (v9.15.1 or later).
  2. Clone the repository
    • Open a terminal and enter the following commands:
      git clone https://github.com/nanobrowser/nanobrowser.git
      cd nanobrowser
      
  3. Install dependencies
    • Run:
      pnpm install
      
  4. Build the extension
    • Run:
      pnpm build
      
    • When the build is complete, the "dist" folder will contain the extension files.
  5. Load into Chrome
    • Follow step 3 in "Method 1" to load the "dist" folder.
  6. Development mode (optional)
    • If real-time debugging is required, run:
      pnpm dev
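
Before building, it can help to confirm that the installed tools meet the minimum versions listed in step 1. The TypeScript sketch below compares version strings such as those printed by `node --version` and `pnpm --version`. The helper names `parseVersion` and `meetsMinimum` are illustrative assumptions, not part of Nanobrowser, and pre-release suffixes are ignored.

```typescript
// Minimum versions taken from the build prerequisites above.
const MIN_NODE = "22.12.0";
const MIN_PNPM = "9.15.1";

// Hypothetical helper: parse "v22.12.0" or "22.12.0" into [22, 12, 0].
// Parts that are missing or not numeric count as 0.
function parseVersion(version: string): number[] {
  return version
    .trim()
    .replace(/^v/, "")
    .split(".")
    .map((part) => parseInt(part, 10) || 0);
}

// Hypothetical helper: true when `actual` is the same as or newer than `minimum`,
// comparing major, minor, and patch in that order.
function meetsMinimum(actual: string, minimum: string): boolean {
  const a = parseVersion(actual);
  const m = parseVersion(minimum);
  for (let i = 0; i < 3; i++) {
    const left = a[i] ?? 0;
    const right = m[i] ?? 0;
    if (left !== right) return left > right;
  }
  return true;
}
```

For example, `meetsMinimum("v20.11.0", MIN_NODE)` is false, so that Node.js installation would need upgrading before running `pnpm install`.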
      

How to use the main features

1. Task automation

  • Workflow:
    • Click the Nanobrowser icon in the toolbar to open the sidebar.
    • Enter a task instruction in the input box, such as "Go to TechCrunch and extract the top 10 headlines from the last 24 hours."
    • Click "Execute" to start the multi-agent system:
      • Planner: Creates a task plan, such as opening TechCrunch and locating the headline area.
      • Navigator: Performs web navigation and data extraction.
      • Validator: Checks that the results meet the requirements.
    • Results are displayed in the sidebar, where they can be copied or used for follow-up questions.
  • Usage scenarios:
    • News summary: Extract the latest information from a specific website.
    • Shopping research: Search Amazon for "waterproof bluetooth speaker, under $50, with over 10 hours of battery life".
    • Code research: Find the most popular Python repositories on GitHub.
2. Configuring agent models

  • Workflow:
    • Open the sidebar and click "Settings".
    • Enter the API key and select a model for each agent, for example:
      • Planner: OpenAI GPT-4o
      • Navigator: Anthropic Claude 3.5 Sonnet
      • Validator: OpenAI GPT-3.5
    • Click "Save" to test whether the connection is successful.
  • Notes:
    • Different models suit different tasks, so it is worth trying different combinations to improve efficiency.
    • Make sure the API key is valid to avoid interrupted tasks.
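
The per-agent model choice can be pictured as a small settings object. The sketch below is an assumption about how such settings might be modeled and checked, not Nanobrowser's actual storage format: the type names, the `findSettingsProblems` helper, the model identifiers, and the API key value are all placeholders.

```typescript
type Provider = "openai" | "anthropic";
type AgentName = "planner" | "navigator" | "validator";

// Hypothetical shape: which provider and model an agent uses.
interface AgentModel {
  provider: Provider;
  model: string; // model identifier as the provider names it
}

// Hypothetical shape: one API key per provider, one model per agent.
interface ExtensionSettings {
  apiKeys: Partial<Record<Provider, string>>;
  agents: Record<AgentName, AgentModel>;
}

// Return a list of human-readable problems; an empty list means the settings look usable.
function findSettingsProblems(settings: ExtensionSettings): string[] {
  const problems: string[] = [];
  const agentNames: AgentName[] = ["planner", "navigator", "validator"];
  for (const name of agentNames) {
    const { provider, model } = settings.agents[name];
    if (model.trim() === "") problems.push(`${name}: no model selected`);
    const key = settings.apiKeys[provider];
    if (key === undefined || key.trim() === "") {
      problems.push(`${name}: missing API key for ${provider}`);
    }
  }
  return problems;
}

// Example mirroring the combination listed above (identifiers and key are placeholders).
const exampleSettings: ExtensionSettings = {
  apiKeys: { openai: "sk-placeholder" },
  agents: {
    planner: { provider: "openai", model: "gpt-4o" },
    navigator: { provider: "anthropic", model: "claude-3-5-sonnet" },
    validator: { provider: "openai", model: "gpt-3.5-turbo" },
  },
};
```

With these example settings, `findSettingsProblems(exampleSettings)` reports that the Navigator has no Anthropic key, which is the kind of gap that would otherwise only surface as an interrupted task.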

3. Viewing and managing conversation history

  • Workflow:
    • Select "Conversation History" in the sidebar.
    • A list of tasks is displayed with times, instructions, and results.
    • Click a record to view its details, or select "Retry" to run it again.
  • Tips:
    • Export the history as a JSON file for easy backup.
    • Examine the logs of failed tasks and refine the instructions or model choices.
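
As a rough illustration of the two tips above, the sketch below serializes a task history to JSON and picks out failed tasks for review. The `TaskRecord` shape, the function names, and the sample records are hypothetical and made up for this example; the extension's real export format may differ.

```typescript
// Hypothetical record shape based on the fields the history list shows:
// time, instruction, and result, plus a status used for filtering.
interface TaskRecord {
  startedAt: string; // ISO 8601 timestamp
  instruction: string;
  status: "success" | "failed";
  result: string;
}

// Serialize the history as pretty-printed JSON, suitable for saving as a backup file.
function exportHistory(records: TaskRecord[]): string {
  return JSON.stringify(records, null, 2);
}

// Pick out failed tasks so their instructions can be reviewed and reworded.
function failedTasks(records: TaskRecord[]): TaskRecord[] {
  return records.filter((record) => record.status === "failed");
}

// Made-up sample data for illustration only.
const sampleHistory: TaskRecord[] = [
  {
    startedAt: "2025-01-01T09:00:00Z",
    instruction: "Extract the top 10 headlines from TechCrunch",
    status: "success",
    result: "10 headlines extracted",
  },
  {
    startedAt: "2025-01-01T09:05:00Z",
    instruction: "Find the most popular Python repositories on GitHub",
    status: "failed",
    result: "Navigation timed out",
  },
];
```

Calling `failedTasks(sampleHistory)` returns only the second record, whose instruction is the one to reword or retry with a different model.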

4. Follow-up questions

  • Workflow:
    • Once a task is complete, enter a follow-up question in the sidebar, such as "Which of these headlines are AI-related?".
    • The system answers based on the previous results without re-executing the whole task.
  • Advantages:
    • More efficient interaction, well suited to in-depth analysis.
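
One plausible way to answer a follow-up without browsing again is to replay the earlier instruction and result as conversation context. The sketch below assumes a role-tagged message list in the style used by common chat APIs; the `ChatMessage` type, the `buildFollowUpMessages` helper, and the system prompt wording are assumptions, not Nanobrowser's actual implementation.

```typescript
// Hypothetical message shape in the style of common chat APIs.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the message list for a follow-up question. The earlier instruction and
// its result are replayed as context, so no new browsing is needed.
function buildFollowUpMessages(
  instruction: string,
  previousResult: string,
  followUp: string,
): ChatMessage[] {
  return [
    { role: "system", content: "Answer using only the earlier task result." },
    { role: "user", content: instruction },
    { role: "assistant", content: previousResult },
    { role: "user", content: followUp },
  ];
}
```

For the example above, `followUp` would be "Which of these headlines are AI-related?" and `previousResult` would be the extracted headline list, so the model only needs to filter text it has already been given.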

Featured Functions

Multi-agent system

  • How to experience it:
    • Enter a complex instruction such as "Find the 5 most popular AI models on HuggingFace and organize them into a list".
    • Planner breaks down the task, Navigator extracts the data, and Validator verifies the accuracy.
    • The results are returned in structured form.
  • Advantages:
    • Dynamic error correction: Planner adjusts its strategy when it encounters obstacles.
    • Efficient collaboration: Dividing the work among three specialized agents saves time.
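
The hand-off between the three agents, including the Planner adjusting its strategy after a failed check, can be sketched as a simple retry loop. This is a simplified illustration under stated assumptions, not Nanobrowser's code: the `AgentTeam` interface, `runTask`, and the stub agents are hypothetical, and real agents would call an LLM and drive the browser asynchronously.

```typescript
// Hypothetical agent interfaces for this sketch.
interface AgentTeam {
  plan: (task: string, feedback: string | null) => string[];
  navigate: (step: string) => string;
  validate: (task: string, outputs: string[]) => { ok: boolean; reason: string };
}

interface RunResult {
  ok: boolean;
  attempts: number;
  outputs: string[];
}

// Plan, execute each step, validate; on failure, feed the reason back to the planner.
function runTask(task: string, team: AgentTeam, maxAttempts: number = 3): RunResult {
  let feedback: string | null = null;
  let outputs: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const steps = team.plan(task, feedback);
    outputs = steps.map((step) => team.navigate(step));
    const verdict = team.validate(task, outputs);
    if (verdict.ok) return { ok: true, attempts: attempt, outputs };
    feedback = verdict.reason; // the planner sees why the last attempt failed
  }
  return { ok: false, attempts: maxAttempts, outputs };
}

// Stub team for illustration: the first plan misses a step, the second one adds it.
const stubTeam: AgentTeam = {
  plan: (_task, feedback) =>
    feedback === null ? ["open page"] : ["open page", "extract list"],
  navigate: (step) => `done: ${step}`,
  validate: (_task, outputs) =>
    outputs.length >= 2
      ? { ok: true, reason: "" }
      : { ok: false, reason: "list was not extracted" },
};

const stubResult = runTask("Find the 5 most popular AI models", stubTeam);
// stubResult.ok is true and stubResult.attempts is 2
```

Passing the Validator's reason back into `plan` is what lets the second attempt differ from the first; without that feedback the loop would simply repeat the same failing plan.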

Local operation and privacy protection

  • How to verify:
    • Open Chrome Developer Tools (F12) and switch to the "Network" tab.
    • While a task is running, only LLM API calls should appear, with no other external requests.
  • Benefit:
    • User credentials and sensitive data are not uploaded to the cloud, which keeps them private.
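
The manual check above can be made more systematic by listing the request URLs seen in the Network tab and flagging any host that is not expected. The sketch below assumes the URLs have already been collected as plain strings; the `unexpectedHosts` helper is hypothetical, and the default allowlist of `api.openai.com` and `api.anthropic.com` is an assumption about which LLM endpoints are in use.

```typescript
// Hosts expected to receive LLM API calls; both are assumptions for this sketch.
const EXPECTED_LLM_HOSTS: string[] = ["api.openai.com", "api.anthropic.com"];

// Return the distinct hosts in `requestUrls` that are not in the allowlist.
// Throws a TypeError if any entry is not a valid absolute URL.
function unexpectedHosts(
  requestUrls: string[],
  allowedHosts: string[] = EXPECTED_LLM_HOSTS,
): string[] {
  const allowed = new Set(allowedHosts);
  const unexpected = new Set<string>();
  for (const raw of requestUrls) {
    const host = new URL(raw).hostname;
    if (!allowed.has(host)) unexpected.add(host);
  }
  return Array.from(unexpected);
}
```

Pages visited during the task make their own requests, so the site being automated (for example `techcrunch.com`) would normally be added to `allowedHosts`; anything still reported afterwards is worth a closer look.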

Interactive sidebar

  • How to use:
    • When the sidebar is open, task progress is displayed in real time (e.g. "Navigating", "Validating").
    • Instructions can be adjusted or tasks stopped midway.
  • Features:
    • The interface is intuitive and suits both novice and professional users.

Notes

  • Network requirement: A stable network connection is required to call the LLM API.
  • Hardware recommendation: Runs better on high-performance hardware.
  • Community support: If you run into problems, join the Discord server or follow the project on X to get help.
May not be reproduced without permission: Chief AI Sharing Circle » Nanobrowser: Multi-Agent Plugin for Task Automation in Browsers
