
OWL: An Automated Multi-Agent Collaboration Tool for Real-World Tasks

General Introduction

OWL (Optimized Workforce Learning) is an open source framework developed by the CAMEL-AI team that focuses on optimizing multi-agent collaboration for automating real-world tasks. Built on the CAMEL-AI architecture, OWL improves the naturalness, efficiency, and robustness of task processing through dynamic agent interactions. On the GAIA benchmark, OWL achieved an average score of 58.18, ranking first among open source frameworks. The project was officially open-sourced on March 7, 2025, and the code is hosted on GitHub (https://github.com/camel-ai/owl) together with detailed documentation and examples. It aims to bridge AI research and real-world applications, and suits both academic exploration and task automation scenarios.

The saddest thing about the Chinese-speaking community is that, as a source of information, it never introduces CAMEL-AI or AgentGPT; instead, it is interested in things like Manus. OWL is very interesting. Commercializing some products promotes technological advancement; commercializing others does not.


Feature List

  • Real-time information retrieval: Accesses up-to-date information via Wikipedia, Google Search, and other online resources.
  • Multimodal processing: Processes video, image, and audio data from the network or from local files.
  • Browser automation: Built on the Playwright framework; simulates browser actions such as scrolling, clicking, typing, and downloading.
  • Document parsing: Extracts the contents of Word, Excel, PDF, and PowerPoint files and converts them to text or Markdown.
  • Code execution: Writes and runs Python code through an interpreter to complete tasks.
  • Multi-agent collaboration: Multiple AI agents interact dynamically to collaborate on complex tasks.

 

Usage Guide

Installation

OWL is an open source project. Users need to download the source code from GitHub and configure the runtime environment. The detailed installation steps are as follows:

  1. Clone the repository
    Enter the following commands in the terminal to get the OWL source code:
    git clone https://github.com/camel-ai/owl.git
    cd owl
  2. Set up the environment
  • Recommended: use Conda:
    conda create -n owl python=3.11
    conda activate owl
    
  • Alternative: use venv:
    python -m venv owl_env
    
    • Activate on Windows:
      owl_env\Scripts\activate
      
    • Activate on Unix or macOS:
      source owl_env/bin/activate
      
  3. Install dependencies
    After activating the environment, run the following commands to install the dependencies:
    python -m pip install -r requirements.txt
    playwright install

    Note: playwright install installs the browser components required for browser automation.

  4. Configure environment variables
    OWL needs API keys to use external services (e.g. OpenAI models). Configure them as follows:
  • Copy the template file:
    cp .env_template .env
    
  • Edit the .env file and fill in the API key, for example:
    OPENAI_API_KEY=your_openai_key
    
  • Obtaining keys: refer to the service registration URLs listed in owl/.env_template.
  • Additional model support: see the CAMEL model documentation (https://docs.camel-ai.org/key_modules/models.html).
    Note: OpenAI models are officially recommended for best performance; other models may perform poorly on complex tasks.
  5. Verify the installation
    Run the following command to test the environment:
    python owl/run.py

If the console prints normal output with no errors, the installation was successful.
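Before running OWL, it can help to confirm that the .env file from the configuration step really contains a filled-in key. The following is a minimal sketch using only the Python standard library; it is not part of OWL. The helper names parse_env_text and missing_keys are hypothetical, and treating values that start with "your_" as unfilled placeholders is an assumption based on the example value shown above.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str], required: list[str]) -> list[str]:
    """Return required keys that are absent, empty, or still a placeholder."""
    return [
        key for key in required
        if not env.get(key) or env[key].startswith("your_")  # assumed placeholder prefix
    ]


if __name__ == "__main__":
    env_path = Path(".env")
    env = parse_env_text(env_path.read_text()) if env_path.exists() else {}
    missing = missing_keys(env, ["OPENAI_API_KEY"])
    print("Missing keys:", missing if missing else "none")
```

Run it from the directory that contains .env; any key it reports as missing still needs a real value.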

Main Feature Workflows

1. Running the basic example

OWL provides a minimal example script, run.py, which you can run directly to try it out:

  • Enter in the terminal:
    python owl/run.py
  • Output: The console displays the results of running the default task.

2. Custom tasks

Users can modify the run.py script to run custom tasks:

  • Edit the script: open run.py and change the task description, for example:
    # Inside run.py, where construct_society, run_society and logger are in scope
    question = "Check the latest stock price of Apple Inc."
    society = construct_society(question)
    answer, chat_history, token_count = run_society(society)
    logger.success(f"Answer: {answer}")
  • Run the script:
    python owl/run.py
    
  • View results: The console outputs the stock price information.
  • Other sample tasks:
    • "Analyze the sentiment of recent tweets on climate change."
    • "Help me debug this Python code: [code content]"
    • "Summarize the main points of this research paper: [Paper URL]."
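To run several of these tasks in one go, the construct-then-run pattern above can be wrapped in a loop. The sketch below is an illustration only: run_tasks, fake_construct and fake_run are hypothetical names, and the stand-in functions exist so the example runs without OWL installed. It assumes the run function returns an (answer, chat_history, token_count) tuple, as in the snippet above.

```python
from typing import Any, Callable


def run_tasks(
    questions: list[str],
    construct: Callable[[str], Any],
    run: Callable[[Any], tuple[str, list, int]],
) -> dict[str, str]:
    """Run each question through construct -> run and collect the answers.

    A failure on one task is recorded and does not stop the others.
    """
    answers: dict[str, str] = {}
    for question in questions:
        try:
            society = construct(question)
            answer, _chat_history, _token_count = run(society)
            answers[question] = answer
        except Exception as exc:  # keep going if a single task fails
            answers[question] = f"ERROR: {exc}"
    return answers


# Stand-ins so the sketch runs on its own; inside run.py you would pass
# the real construct_society and run_society instead.
def fake_construct(question: str) -> dict:
    return {"question": question}


def fake_run(society: dict) -> tuple[str, list, int]:
    return (f"stub answer for: {society['question']}", [], 0)


if __name__ == "__main__":
    results = run_tasks(
        ["Check the latest stock price of Apple Inc."],
        fake_construct,
        fake_run,
    )
    for question, answer in results.items():
        print(question, "->", answer)
```

Passing the two functions as arguments keeps the loop independent of OWL itself, so it can be tried with stubs first and switched to the real functions afterwards.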

3. Browser automation

OWL supports browser interaction via Playwright, for example to scrape web pages:

  • Sample script: create a file (e.g. web_task.py):
    # Import path and method names are as given in this article; check the official documentation for the current API
    from owl import BrowserAgent
    agent = BrowserAgent()
    agent.navigate("https://example.com")  # open the page
    content = agent.get_content()  # read the page content
    print(content)
    
  • Run the script:
    python web_task.py
    
  • Result: Outputs the text content of the web page.
  • Supported operations: scrolling, clicking, typing, downloading, and more. Refer to the official documentation for the specific APIs.
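For reference, the underlying operations (navigating, scrolling, reading page text) can also be tried directly with Playwright's own Python API, independent of OWL. The following is a minimal sketch, assuming Playwright and its browsers were installed in the dependency step; fetch_page_text and collapse_whitespace are hypothetical helper names, and this is not a description of how OWL drives the browser internally.

```python
def collapse_whitespace(text: str) -> str:
    """Join runs of whitespace into single spaces for compact printing."""
    return " ".join(text.split())


def fetch_page_text(url: str, scroll_pixels: int = 1000) -> str:
    """Open url in headless Chromium, scroll once, and return the body text."""
    # Imported here so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        page.mouse.wheel(0, scroll_pixels)  # simulate scrolling down
        text = page.inner_text("body")  # visible text of the page body
        browser.close()
    return collapse_whitespace(text)
```

Calling fetch_page_text("https://example.com") returns the page text as a single line; clicking and typing are available through page.click and page.fill in the same API.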

4. Document parsing and multimodal processing

  • Parse a document: place a local file (e.g. sample.pdf) in the owl directory and run the following code:
    from owl.utils import parse_document
    text = parse_document("sample.pdf")  # extract the file contents as text
    print(text)
    
  • Process video: both local and network videos can be analyzed, for example:
    from owl.multimodal import process_video
    result = process_video("https://example.com/video.mp4")  # analyze a video by URL
    print(result)
    

Featured Operations

Real-time information retrieval

  • Steps: specify the information source in the task description, for example:
    question = "Get the latest definition of artificial intelligence from Wikipedia."
    society = construct_society(question)
    answer, chat_history, token_count = run_society(society)
    print(answer)
    
  • Result: Returns the latest content from Wikipedia.

GAIA Benchmark Replication

  • Run the test: reproduce the GAIA results using the provided script:
    python run_gaia_roleplaying.py
    
  • View results: The script outputs the score for each task, which lets you verify OWL's benchmark performance (average score 58.18).
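If you collect the per-task outcomes yourself, an overall percentage is easy to compute for comparison. The sketch below assumes each task result is recorded as a simple correct/incorrect flag; that format, and the helper name accuracy_percent, are assumptions for illustration and not the script's documented output.

```python
def accuracy_percent(results: list[bool]) -> float:
    """Percentage of tasks answered correctly, rounded to two decimals."""
    if not results:
        raise ValueError("no task results to score")
    # True counts as 1 and False as 0 when summed.
    return round(100 * sum(results) / len(results), 2)


if __name__ == "__main__":
    # Hypothetical outcomes for four tasks: three correct, one incorrect.
    print(accuracy_percent([True, False, True, True]))
```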

Usage Notes

  • Git and Python 3.11+ must be installed on the system.
  • For large-scale tasks, a high-performance machine and a stable network connection are recommended.
  • If the Chrome window is blank while the console shows output, this is normal: the window is only used when a task requires browser interaction.

May not be reproduced without permission:Chief AI Sharing Circle " OWL: An automated tool for multi-intelligence collaboration on realistic tasks
