
Anubis: Blocking AI Crawlers with Proof of Work

General Introduction

Anubis is an open-source tool developed by the TecharoHQ team to protect websites from AI crawlers. It adds a SHA-256 proof-of-work challenge to HTTP requests: a visitor must complete a small computational task before the request is passed on, which blocks non-compliant automated crawlers. The tool is written in Go, published on GitHub, and suited to websites that do not want to be indexed by search engines or that need to protect their resources. TecharoHQ is a Canadian company specializing in anti-AI-crawler technology. Anubis was motivated by the abusive behavior of AI crawlers on the modern Internet, which ignore robots.txt and put a heavy load on servers. The project itself describes Anubis as a "nuke-level" solution: extreme, but effective against the crawler problem.

Anubis is efficient and does not rely on Cloudflare (see also: Cloudflare Launches AI Maze: Countering Malicious Crawlers with Generative AI). The downside is just as obvious: it is not friendly to sites that need SEO.

 

Feature List

  • Authenticates HTTP requests with a SHA-256 proof-of-work challenge to block AI crawlers.
  • Protects upstream server resources from being over-consumed by automated tools.
  • Supports Docker deployment for quick integration into existing systems.
  • Provides a health check to confirm that the service is running.
  • Open source, so users are free to modify and customize it.
  • Deliberately blocks some search engines from indexing, for sites that do not want to be public.

 

Usage Guide

Installation process

Anubis is easy to install and is suitable for users with a technical background. Below are the detailed steps:

1. Prerequisites

  • Git and Docker need to be installed:
    • Git is used to fetch code.
    • Docker is used to run Anubis.
  • Check the environment:
    git --version
    docker --version

Make sure both commands print a version number.
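
The same prerequisite check can be scripted. The following is a minimal sketch, not part of Anubis: the helper name `missing_tools` is made up for illustration, and it only tests whether the `git` and `docker` executables are on the PATH, not whether the Docker daemon is running.

```python
import shutil


def missing_tools(tools=("git", "docker")):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("git and docker are both on PATH")
```

An empty result means both tools were found; anything listed needs to be installed before continuing.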

2. Get the code

  • Run in a terminal:
    git clone https://github.com/TecharoHQ/anubis.git
    cd anubis
    

3. Build and Run with Docker

  • Build the image:
    docker build -t anubis .
    
  • Run the container:
    docker run -p 8080:8080 anubis
    
  • With the port mapping above, the service is reachable on port 8080; open http://localhost:8080 to test it.
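
Before opening a browser, it can help to confirm that something is actually listening on the mapped port. This is a rough sketch under the assumption that the container was started with the 8080 mapping shown in this step; `port_is_open` is an illustrative helper, and a successful TCP connection only shows that the port accepts connections, not that Anubis itself is healthy.

```python
import socket


def port_is_open(host="localhost", port=8080, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    print("port 8080 reachable:", port_is_open())
```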

4. Configuration (optional)

  • Use a custom port or address:
    docker run -p 9000:9000 anubis --listen=http://0.0.0.0:9000
    
  • Replace 9000 with the port you want.

5. Checking operational status

  • Check the health status:
    docker exec <container ID> /app/bin/anubis --healthcheck
    
  • A normal (healthy) result indicates that the service is running correctly.
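
For monitoring scripts, the health check command above can be wrapped in a small function. This sketch rests on an assumption the article does not confirm: that the health check exits with status 0 when the service is healthy and with a non-zero status otherwise. The helper name `is_healthy` and its `run` parameter are illustrative only.

```python
import subprocess


def is_healthy(container_id, run=subprocess.run):
    """Run the health check inside the container.

    Assumption: exit status 0 means healthy, anything else means unhealthy.
    `run` is injectable so the function can be exercised without Docker.
    """
    result = run(
        ["docker", "exec", container_id, "/app/bin/anubis", "--healthcheck"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

Calling `is_healthy("<container ID>")` with a real container ID runs the same command as the step above and reduces the result to a boolean.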

How to use the main features

At the heart of Anubis is SHA-256 proof-of-work protection. Here is how it works in practice:

Proof-of-work protection

  • When a user visits the site, Anubis requires the client to complete a SHA-256 computation.
  • A normal browser finishes it quickly; AI crawlers are deterred by the high computational cost at scale.
  • No manual operation is needed after deployment; the protection takes effect automatically.
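
To make the idea concrete, here is a minimal sketch of a generic SHA-256 proof-of-work scheme. It is not Anubis's actual implementation: the challenge format, the "leading zero hex digits" difficulty rule, and the names `solve` and `verify` are assumptions chosen for illustration. The point is the asymmetry: the client has to try many nonces, while the server checks the answer with a single hash.

```python
import hashlib
from itertools import count


def verify(challenge, nonce, difficulty):
    """Server side: one hash to check that the digest starts with `difficulty` zero hex digits."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge, difficulty):
    """Client side: brute-force the smallest nonce that satisfies the difficulty."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce


if __name__ == "__main__":
    # "example-challenge" is a hypothetical challenge string, not a real Anubis value.
    nonce = solve("example-challenge", 3)
    print("nonce:", nonce, "valid:", verify("example-challenge", nonce, 3))
```

Raising the difficulty by one multiplies the client's expected work by 16 in this scheme, while verification stays at one hash.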

Testing the protection

  • Official test site, anubis.techaro.lol:
    • Open https://anubis.techaro.lol in a browser and you will see the verification process.
    • Test with a crawler-style tool:
      curl https://anubis.techaro.lol
      

      The response indicates that a proof-of-work challenge must be completed.
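
The curl test can also be reproduced from a script, which is closer to how a simple crawler behaves: it sends a plain GET and never executes JavaScript. The sketch below uses only the Python standard library; the `fetch` helper and the `example-crawler/0.1` User-Agent string are made up for illustration, and the code makes no assumption about what the protected site returns.

```python
import urllib.error
import urllib.request


def fetch(url, timeout=10.0):
    """Return (status, body) for a plain GET, without running any JavaScript."""
    req = urllib.request.Request(url, headers={"User-Agent": "example-crawler/0.1"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a body worth inspecting.
        return err.code, err.read().decode("utf-8", errors="replace")
```

Calling `fetch("https://anubis.techaro.lol")` and printing the status and the first few hundred characters of the body shows what a non-browser client receives.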

Viewing logs

  • Check the run log:
    docker logs <container ID>
    
  • The log shows the details of request validation.

Support and Feedback

  • If you run into problems, file an issue on GitHub:
    • Address: https://github.com/TecharoHQ/anubis/issues/new
    • Include a detailed description, with your environment and any error messages.
  • For real-time discussion, join the Discord linked from Patreon:
    • Address: https://patreon.com/cadey
    • Ask your question in the #anubis channel.

Caveats

  • Anubis prevents some search engines (such as Google) from indexing websites. This is a deliberate official feature, not a flaw.
  • If SEO is required, the official recommendation is to use Cloudflare instead.
  • Anubis is suitable for scenarios where Cloudflare is not available or strong protection is required.

 

Application Scenarios

  1. Protecting a personal website
    • If you run a blog and do not want AI crawlers scraping your content, Anubis can block them effectively.
  2. Hosting private resources
    • When sharing files with specific users, put Anubis in front so that only clients that pass the challenge can access them.
  3. Development and testing environments
    • During development, use Anubis to keep crawlers from interfering and to protect server resources.

 

FAQ

  1. Does Anubis affect normal users?
    • No. The computational task is light and barely noticeable for browser users.
  2. Is it suitable for a production environment?
    • Yes. Docker deployment is simple, and the tool has been tested to be stable.
  3. Why can't search engines index it?
    • The proof-of-work challenge blocks crawlers, including search engine crawlers. This is a design goal of Anubis.
  4. What if I don't use Anubis?
    • It is possible to protect a website with Cloudflare, which is suitable for most situations.
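
The difference between "light for one visitor" and "expensive for a crawler" can be estimated with simple arithmetic. The sketch below assumes a scheme in which a valid digest must start with a given number of zero hex digits, so each attempt succeeds with probability 1/16^d and the expected number of attempts is 16^d. The difficulty values, the hash rate, and the function names are hypothetical, not measured Anubis figures.

```python
def expected_hashes(difficulty):
    """Expected SHA-256 attempts to hit `difficulty` leading zero hex digits."""
    return 16 ** difficulty


def expected_seconds(difficulty, hashes_per_second, challenges=1):
    """Average time to solve `challenges` independent challenges at a given hash rate."""
    return challenges * expected_hashes(difficulty) / hashes_per_second
```

For example, at an assumed rate of 65,536 hashes per second and difficulty 4, a single challenge takes about one second on average, while a crawler that has to solve 1,000 separate challenges spends about 1,000 seconds.
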
May not be reproduced without permission: Chief AI Sharing Circle