AI Personal Learning
and practical guidance

Is data crawling difficult? The Automa plugin makes it easy!

Are you running into any of these problems?

  • "Manually copying and pasting data is too time-consuming and inefficient."
  • "I want to collect web page data in bulk, but I don't know how to write code."
  • "I've tried other crawler tools, but they are too complicated and take too long to learn."
  • "I'm worried that my crawler will get banned by the site, and I don't know how to deal with it."

Don't worry! Today I'm going to show you how to use Automa, a tool that makes data crawling easy and efficient.


 

1. Automa: your no-code data collection assistant

Automa plugin interface overview

Automa is a powerful automation plugin for Chrome. It helps you automate web browsing, collect data in batches, export data to various formats, and set up scheduled tasks.

Most importantly, there is no need to write any code: everything is done through the visual interface!

 

2. From Beginner to Master: Three Steps to Data Crawling

Step 1: Installation and basic setup

Search for "Automa" in the Chrome Web Store and install it. Then click the Automa icon in the top right corner of your browser and create a new workflow.

Chrome Store Installation Screen

Automa plugin location

Workflow creation screen

 

Step 2: Design the workflow

Take crawling e-commerce product data as an example. The core steps are: set the start page, add a loop block to handle pagination, extract the product information, and finally export the data.

Step 3: Run and Optimize

To keep data collection stable and efficient, set a reasonable delay so that each page finishes loading before the next step runs. Also add error handling so that an unexpected failure does not interrupt the whole run.
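For readers curious about what these two settings amount to, here is a minimal JavaScript sketch of "wait, then retry on failure". It is not how Automa implements its blocks; `sleep`, `withRetry`, and the default values are hypothetical names chosen only for illustration.

```javascript
// Hypothetical helper: pause for the given number of milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: run `task`, waiting `delayMs` before each attempt,
// and retry up to `maxAttempts` times before giving up.
async function withRetry(task, { maxAttempts = 3, delayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await sleep(delayMs); // give the page time to finish loading
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error; // remember the failure and try again
    }
  }
  throw lastError; // every attempt failed
}
```

In Automa itself you would get the same effect without code, by adding delay and wait blocks and configuring what happens when a block fails.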

 

3. Practical case: collecting popular post data from Xiaohongshu (Little Red Book)

Automa core concepts

Before we get down to the nitty-gritty, let's go over a few core concepts of Automa:

  1. Workflow: A container for the overall task flow.
  2. Block: A specific functional module within a workflow.
  3. Selector: A tool for positioning web elements.
  4. Variable: Stores temporary data.
  5. Trigger: The condition that initiates a workflow.
  6. Table: A table for collecting and organizing data.

Overview of Workflow Automation Basics

 

Case Study

Let's use Xiaohongshu (Little Red Book) popular notes as an example of collecting data with Automa. The core idea is to work out the steps you would follow to collect the data by hand, and then have Automa automate them.

Little Red Book Data Collection Process

Collecting Little Red Book data with Automa breaks down into the following steps.

Create workflows and configure triggers

Create a workflow called "Xiaohongshu Data Collection". In the Trigger, add a parameter named "key_word", which is used to input the keyword to be searched. The default value of this parameter is set to "independent developer".

Trigger Configuration

Open the target page and search

Use the New Tab block to open the Little Red Book home page (https://www.xiaohongshu.com/explore). Then use the Forms block to locate the search box.

How to select elements

  1. Click the icon shown below in the dashboard sidebar to open the element selection page.

    Get Selector

  2. Select the element on the page, then click the copy button in the upper right corner.

    Copy Selector

  3. Paste the selector copied in the previous step into Automa's CSS Selector field.

    Paste Selector

Looping over the data

Iterate through the list of notes using the Loop Elements block. We need to get the selector for the list of notes:

  1. On the notes list page, right-click any note cover.
  2. Get the selector ".note-item .cover" with Automa's selector tool.

Loop configuration

Open the post and get the details

Inside the loop, we need to click each note to open its detail page. Three points need attention:

  1. Wait for the page to load: use the Wait Element block to make sure the page has loaded completely.
  2. Click the note cover: use the Click Element block to click each note's cover.
  3. Wait for the detail page to load: use the Wait Element block again to make sure the detail page has fully loaded.

    Open Element Schematic

How to obtain the selectors for the data collected in each loop iteration:

  1. KOL name: right-click the author name > Inspect > Copy selector, which gives "a.name"
  2. Note title: selector "div#detail-title"
  3. Note content: selector "#detail-desc > .note-text > span"
  4. Engagement data:
    • Likes: ".left > .like-wrapper > .count"
    • Favorites: "#note-page-collect-board-guide > .count"
    • Number of comments: ".chat-wrapper > .count"
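If you want to check these selectors by hand, the list above can be combined into one small script. This is only a sketch: `NOTE_SELECTORS` and `extractNote` are hypothetical names, and `root` is assumed to be any object with a `querySelector` method, such as `document` in the browser console.

```javascript
// Selectors copied from the list above; they may stop matching if the site changes.
const NOTE_SELECTORS = {
  author: "a.name",
  title: "div#detail-title",
  content: "#detail-desc > .note-text > span",
  likes: ".left > .like-wrapper > .count",
  favorites: "#note-page-collect-board-guide > .count",
  comments: ".chat-wrapper > .count",
};

// Hypothetical helper: read the text of each field from `root`.
// A field whose selector matches nothing comes back as null.
function extractNote(root) {
  const row = {};
  for (const [field, selector] of Object.entries(NOTE_SELECTORS)) {
    const element = root.querySelector(selector);
    row[field] = element ? element.textContent.trim() : null;
  }
  return row;
}
```

On a note's detail page, running `extractNote(document)` in the console returns one row. Any field that comes back as null points to a selector that needs updating.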

Selector Example

Export data

Finally, use the Export Data block to export the collected data in CSV format.
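To make the CSV format concrete, here is a rough sketch of how collected rows could be turned into CSV text. It is not Automa's exporter; `csvField` and `toCsv` are hypothetical helpers that follow the common quoting convention (wrap a value in double quotes when it contains a comma, a quote, or a line break).

```javascript
// Hypothetical helper: quote one value when it contains a comma,
// a double quote, or a line break; null and undefined become empty.
function csvField(value) {
  const text = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Hypothetical helper: turn an array of row objects into CSV text,
// using the given column names as the header row.
function toCsv(rows, columns) {
  const lines = [columns.map(csvField).join(",")];
  for (const row of rows) {
    lines.push(columns.map((column) => csvField(row[column])).join(","));
  }
  return lines.join("\r\n");
}
```

Quoting matters here because note titles and content often contain commas and line breaks, which would otherwise split a value across columns or rows.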

Tips

  • If a CSS selector is not accurate, try using XPath.
  • Add an appropriate delay to wait for the page to load.
  • Check regularly whether the selectors still work.
  • Collect no more than 20 items per run.
  • Limit how often you run the collection; avoid frequent runs.
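The "no more than 20 items" tip amounts to splitting a long list into small batches and running them at separate times. A minimal sketch, assuming a hypothetical helper named `planBatches` (not part of Automa):

```javascript
// Hypothetical helper: split `items` into batches of at most `batchSize`
// (20 by default, matching the tip above).
function planBatches(items, batchSize = 20) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let start = 0; start < items.length; start += batchSize) {
    batches.push(items.slice(start, start + batchSize));
  }
  return batches;
}
```

For example, 45 note links would be split into three batches of 20, 20, and 5.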

With reasonable delay control and accurate selector positioning, the whole workflow can complete the data collection task stably. The parameterized configuration also makes it easy to change the search keyword for different needs.

4. Frequently asked questions and solutions

Dynamic Selector Explained

We often need to use dynamic selectors when collecting multiple similar elements. Let's learn it through a practical example.

Take this selector as an example.

!!.note-item:nth-child({{loopData.loopId.$index+1}}) .cover

This selector looks complicated, so let's break it down step by step.

The !! prefix is Automa's special syntax for enabling JavaScript expressions in a field. Without it, the $index+1 arithmetic inside {{}} would not be evaluated.

.note-item selects elements with the class "note-item", which is usually the container for each post in the list.

:nth-child() is a CSS pseudo-class that selects a child element at a specific position; the number or expression goes inside the parentheses.

In {{loopData.loopId.$index+1}}, {{}} is Automa's variable syntax and loopData.loopId.$index is the current index in the loop (starting from 0). The +1 is needed because :nth-child starts counting from 1.

.cover selects the final target element, in this case the cover image of the post.

Configure the loop block like this.

{
  selector: "!!.note-item:nth-child({{loopData.loopId.$index+1}}) .cover",
  timeout: 5000
}

Why is it written this way? Because it enables dynamic positioning:

  • 1st iteration: .note-item:nth-child(1) .cover
  • 2nd iteration: .note-item:nth-child(2) .cover
  • 3rd iteration: .note-item:nth-child(3) .cover
  • and so on ...
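The substitution shown in the list can be mimicked in plain JavaScript, which makes the off-by-one adjustment easy to see. This is a sketch only; `coverSelector` is a hypothetical helper, and its `index` argument stands in for loopData.loopId.$index.

```javascript
// Hypothetical helper: build the selector for one loop iteration.
// `index` is zero-based (like the loop index), while :nth-child() is one-based.
function coverSelector(index) {
  if (!Number.isInteger(index) || index < 0) {
    throw new RangeError("index must be a non-negative integer");
  }
  return `.note-item:nth-child(${index + 1}) .cover`;
}

// Selectors for the first three iterations.
const firstThree = [0, 1, 2].map(coverSelector);
console.log(firstThree);
```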

This avoids the problem with fixed selectors:

/* Wrong: matches every cover element */
.note-item .cover

/* Correct: matches only the element for the current iteration */
!!.note-item:nth-child({{loopData.loopId.$index+1}}) .cover

If you are not sure whether the selector is correct, you can test it in the browser console:

// Assuming this is the 3rd loop
document.querySelector('.note-item:nth-child(3) .cover')

Automa's logging feature can also be used:

{
  type: "log",
  message: "Current selector: .note-item:nth-child({{loopData.loopId.$index+1}}) .cover"
}

With this dynamic selector approach, we can accurately locate the target element in each loop iteration, avoid selecting the wrong element, and improve the stability and accuracy of the workflow. Writing selectors is one of the most critical parts of data collection, and sensible use of dynamic selectors makes a workflow more robust and reliable.

May not be reproduced without permission: Chief AI Sharing Circle » Data crawling is difficult? Automa plugin helps you easily!
