
Stagehand: A Framework for Natural Language Implementation of Browser Automation Operations

General Introduction

Stagehand is an AI web browsing framework focused on simplicity and extensibility. It is fully Playwright-compatible and provides three simple AI APIs (act, extract, and observe) built on top of Playwright's Page class, which serve as building blocks for automating web pages in natural language. Stagehand makes it easier to write durable, efficient browser automation code, especially for non-technical users, and the resulting code is less sensitive to small changes in the UI or DOM. Typical tasks range from pulling the day's top stories from Hacker News to searching for and buying products on Amazon. The framework is currently in early release, and the development team is actively seeking community feedback.


Feature List

  • Provides three simple AI APIs: act, extract, and observe
  • Fully compatible with Playwright
  • Supports web automation through natural language
  • Provides debugging tools such as session replay and step-by-step debugging
  • Accessible to non-technical users
  • Tolerant of minor UI/DOM changes
  • Integrates with Browserbase for more powerful debugging tools

 

Usage Guide

Installation process

  1. Clone the Stagehand project:
     git clone https://github.com/browserbase/stagehand.git
     cd stagehand
  2. Install the dependencies:
     npm install
     npx playwright install
  3. Run the sample script:
     npm run example

Guidelines for use

Create a new project

To create a Stagehand project configured with default settings, you can run the following command:

npx create-browser-app --example quickstart

See the Quick Start Guide for more information.

Add to existing project

You can add Stagehand to an existing TypeScript project with the following commands:

npm install @browserbasehq/stagehand zod
npx playwright install

Configuring the API Key

To get the most out of Stagehand, you need an API key from an LLM provider and Browserbase credentials. To add them to your project, copy the example environment file and edit it:

cp .env.example .env
nano .env # Edit the .env file to add the API key
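
Before launching a script, it can help to confirm that the credentials were actually filled in. The following is a minimal sketch of such a check, not part of Stagehand itself. The function name missingEnvVars is hypothetical, and OPENAI_API_KEY is only an assumed name for the LLM provider key; the two Browserbase variable names are the ones used in the debugging section.

```typescript
// Variable names this sketch checks. OPENAI_API_KEY is an assumed name for the
// LLM provider key; adjust it to whichever provider you use.
const REQUIRED_VARS: readonly string[] = [
  "OPENAI_API_KEY",
  "BROWSERBASE_API_KEY",
  "BROWSERBASE_PROJECT_ID",
];

type EnvMap = Record<string, string | undefined>;

// Return the required variable names that are unset or blank in `env`.
function missingEnvVars(
  env: EnvMap,
  required: readonly string[] = REQUIRED_VARS,
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example with a hand-written environment object instead of process.env.
const missing = missingEnvVars({
  OPENAI_API_KEY: "example-key",
  BROWSERBASE_API_KEY: "",
});
console.log(missing); // [ 'BROWSERBASE_API_KEY', 'BROWSERBASE_PROJECT_ID' ]
```

In a real script you would pass process.env (after your .env loader has run) and stop early if the returned list is not empty.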

Core API workflow

  1. act API: Performs actions on the page, such as clicking buttons or filling out forms.
     await page.act('Click the login button');
  2. extract API: Extracts information from a page, such as text or links.
     const headlines = await page.extract('Extract all news headlines');
  3. observe API: Inspects the current page and returns the actions that can be taken on it, for example to locate a button before clicking it.
     const actions = await page.observe('Find the login button');
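
The three calls are usually combined into one flow. The sketch below shows one way to chain them, using a stub in place of a real browser so it runs without Stagehand, Playwright, or an LLM key. The NaturalLanguagePage interface, createStubPage, and collectHeadlines are hypothetical names, and the return types (in particular, observe returning a list of suggested actions) are assumptions; the real Stagehand signatures are richer and may differ between releases.

```typescript
// Hypothetical minimal shape of a page object exposing the three calls.
// This is an illustration only, not Stagehand's actual type definitions.
interface NaturalLanguagePage {
  act(instruction: string): Promise<void>;
  extract(instruction: string): Promise<string[]>;
  observe(instruction: string): Promise<string[]>;
}

// Stub that records each call and returns canned data.
function createStubPage(log: string[]): NaturalLanguagePage {
  return {
    async act(instruction: string): Promise<void> {
      log.push(`act: ${instruction}`);
    },
    async extract(instruction: string): Promise<string[]> {
      log.push(`extract: ${instruction}`);
      return ["Headline A", "Headline B"];
    },
    async observe(instruction: string): Promise<string[]> {
      log.push(`observe: ${instruction}`);
      return ["Click the login button"];
    },
  };
}

// Observe first, act on the first suggestion if there is one, then extract.
async function collectHeadlines(page: NaturalLanguagePage): Promise<string[]> {
  const [firstSuggestion] = await page.observe("Find the login button");
  if (firstSuggestion !== undefined) {
    await page.act(firstSuggestion);
  }
  return page.extract("Extract all news headlines");
}

// Usage: run the flow against the stub and print what happened.
const callLog: string[] = [];
collectHeadlines(createStubPage(callLog)).then((headlines) => {
  console.log(callLog);
  console.log(headlines);
});
```

Writing the flow against a small interface like this also makes it testable: the same function can later be handed a real page object that exposes compatible methods.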

Debugging Tools

Stagehand's integration with Browserbase provides powerful debugging tools such as session replay and step-by-step debugging. You can enable these tools by following the steps below:

  1. Add the Browserbase API key to your project:
     nano .env # Add BROWSERBASE_API_KEY and BROWSERBASE_PROJECT_ID
  2. Enable session replay:
     await page.enableSessionReplay();
  3. Enable step-by-step debugging:
     await page.enableStepByStepDebugging();
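
Because these tools come from Browserbase, they are only useful when the session actually runs there. A minimal sketch of choosing the run target from the two credentials follows. The function chooseStagehandEnv is hypothetical, and it assumes Stagehand is configured with an env option whose values are "LOCAL" and "BROWSERBASE"; that option is not described in the text above, so treat it as an assumption.

```typescript
// Assumed values for Stagehand's run target; not taken from the text above.
type StagehandEnv = "LOCAL" | "BROWSERBASE";

// Pick "BROWSERBASE" only when both credentials are present and non-empty,
// otherwise fall back to a local browser.
function chooseStagehandEnv(
  env: Record<string, string | undefined>,
): StagehandEnv {
  const hasKey = Boolean(env["BROWSERBASE_API_KEY"]);
  const hasProject = Boolean(env["BROWSERBASE_PROJECT_ID"]);
  return hasKey && hasProject ? "BROWSERBASE" : "LOCAL";
}

// Example with hand-written environment objects instead of process.env.
console.log(chooseStagehandEnv({ BROWSERBASE_API_KEY: "key" })); // LOCAL
console.log(
  chooseStagehandEnv({
    BROWSERBASE_API_KEY: "key",
    BROWSERBASE_PROJECT_ID: "project",
  }),
); // BROWSERBASE
```

Falling back to a local browser keeps the script usable without Browserbase credentials, at the cost of losing the hosted debugging tools.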

By following these steps, you can fully utilize the power of Stagehand for efficient browser automation.

May not be reproduced without permission: Chief AI Sharing Circle » Stagehand: A Framework for Natural Language Implementation of Browser Automation Operations
