
Midscene.js: Open Source Plugin for Automated Browser Testing Driven by AI

General Introduction

Midscene.js is an AI-powered browser automation tool that controls web pages, executes assertions, and extracts data through natural language commands. It supports a Chrome extension, a JavaScript SDK, and YAML scripts, which simplifies writing and maintaining UI tests. By using multimodal large language models such as GPT-4o, Midscene.js lets users interact with web pages intuitively and fetch structured JSON data.

ByteDance has open-sourced Midscene.js, which generates E2E tests directly from natural language and interface screenshots, saving teams many hours of repetitive work. Current coding and multimodal model capabilities already solve many basic E2E problems well.

Feature List

  • Natural language interaction: Describe the steps in natural language, and the AI plans and operates the user interface automatically.
  • JSON data extraction: Returns the requested data in JSON format, following the structure the user asks for.
  • Intuitive assertions: Assertions are written in natural language, which the AI interprets and evaluates.
  • Chrome extension: Try the tool through the extension without writing any code.
  • Visual reports: Detailed execution reports help users understand and debug each run.
  • Multiple script formats: Both JavaScript and YAML are supported, so automation scripts can be written in whichever fits the project.

 

Usage Guide

Installation and Configuration

Install the Chrome extension:

  1. Visit the Chrome Web Store and search for "Midscene".
  2. Click the "Add to Chrome" button.
  3. Confirm the installation and allow permissions.

Configure environment variables (for SDK use):

  • To use the OpenAI API, create a .env file and add the following:
    export OPENAI_API_KEY="your-api-key"
    export MIDSCENE_MODEL_NAME="gpt-4o"
  • If you are using another model service, you need to adjust the above environment variables accordingly.
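
Before starting a run, it can help to confirm that these variables are actually set. The following is a minimal sketch in plain Node.js; findMissingEnv is a hypothetical helper written for this article, not part of Midscene, and the list of required names is passed in by the caller.

```javascript
// Hypothetical helper (not a Midscene API): returns the names of the
// required environment variables that are missing or blank.
function findMissingEnv(env, required) {
  return required.filter((name) => {
    const value = env[name];
    return typeof value !== 'string' || value.trim() === '';
  });
}

// Usage: warn early instead of failing halfway through an automation run.
const missing = findMissingEnv(process.env, ['OPENAI_API_KEY', 'MIDSCENE_MODEL_NAME']);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(', ')}`);
}
```

If another model service is used, the same check works with that service's variable names passed as the second argument.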

Usage Process

Using the Chrome Extension

  • Launch the extension: After installation, the extension icon appears in the browser toolbar. Click the icon to open the Midscene control panel.
  • Enter commands: Type natural language commands in the control panel, such as "Click on the login button" or "Extract all headings from the web page".
  • View results: When the operation is complete, the extension returns the execution results, usually presenting extracted data in JSON format.

Using the JavaScript SDK

  • Import the SDK:
    import { ai, aiQuery, aiAssert } from '@midscene/web';
    
  • Perform operations:
    • Basic operation: Use the ai function to perform simple web page operations. Example:
      await ai('Type "React" in the search box');
      
    • Data extraction: Use aiQuery to extract data:
      const data = await aiQuery('{title: string, price: number}[]', 'Find the product list and extract the title and price');
      
    • Assertion checking: Use aiAssert to make assertions:
      await aiAssert('There should be a login button on the page');
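
Because the extracted data comes from a language model, it may be worth checking that the returned value really has the shape requested in the aiQuery call above. The sketch below is plain JavaScript and does not call Midscene; isProductList is a hypothetical validator, and the sample array is made-up data standing in for an aiQuery result.

```javascript
// Hypothetical validator (not a Midscene API): checks that a value
// matches the shape {title: string, price: number}[].
function isProductList(value) {
  return (
    Array.isArray(value) &&
    value.every(
      (item) =>
        item !== null &&
        typeof item === 'object' &&
        typeof item.title === 'string' &&
        typeof item.price === 'number' &&
        Number.isFinite(item.price)
    )
  );
}

// Made-up sample data standing in for the result of an aiQuery call.
const sample = [
  { title: 'Keyboard', price: 49.9 },
  { title: 'Mouse', price: 19 },
];

console.log(isProductList(sample)); // true
console.log(isProductList([{ title: 'Cable', price: '5' }])); // false: price is a string
```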
      

Using YAML Scripts

  • Write a YAML script: Define your automation tasks in a .yaml file, for example:
    - action: type
      selector: 'input[name="search"]'
      value: 'JavaScript'
    - action: click
      selector: 'button[type="submit"]'
    
  • Run the script: Execute these scripts with Midscene's command-line tool (CLI).

Operational details

  • Natural language instructions: Instructions can be as simple as "click" or "enter", or as complex as "find all products labeled 'Sale' and record the price".
  • Error handling: If an operation fails, Midscene provides a detailed report that states the reason for the failure, which helps you adjust the command.
  • Debugging and playback: Each test or operation can be replayed through the visual report to help you understand and debug your scripts.
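
When driving steps from the SDK, a failure can also be captured in code so the script can log it and decide what to do next. The following is a minimal sketch in plain JavaScript; runStep is a hypothetical wrapper that accepts any async function, and the two stub actions stand in for real calls such as ai(...) or aiAssert(...).

```javascript
// Hypothetical wrapper (not a Midscene API): runs one async step and
// returns a record of the outcome instead of throwing.
async function runStep(name, action) {
  const startedAt = Date.now();
  try {
    const result = await action();
    return { name, ok: true, result, durationMs: Date.now() - startedAt };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return { name, ok: false, error: message, durationMs: Date.now() - startedAt };
  }
}

// Stub actions for illustration; in a real script these would call the SDK.
const clickLogin = async () => 'clicked';
const brokenStep = async () => {
  throw new Error('element not found');
};

runStep('click login', clickLogin).then((r) => console.log(r.name, r.ok));
runStep('broken step', brokenStep).then((r) => console.log(r.name, r.ok, r.error));
```

Returning a record rather than rethrowing keeps a multi-step script running, so every failed step can be listed at the end alongside the visual report.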

With the steps above, users can get up to speed quickly and use the features of Midscene.js for efficient browser automation testing.

May not be reproduced without permission: Chief AI Sharing Circle » Midscene.js: Open Source Plugin for Automated Browser Testing Driven by AI