
TankWork: an agent that operates computers via voice and text and provides real-time voice feedback

General Introduction

TankWork is an open-source desktop agent framework that lets AI perceive and control your computer through computer vision and system-level interaction. Agents built with it can control the computer directly through voice and text commands, process screen content in real time, and provide continuous audio-visual feedback and action logs. TankWork is aimed at developers and researchers who want to build autonomous desktop agents that can analyze and interact with computer interfaces.


Feature List

  • Direct computer control: Execute operations via voice and text commands
  • Computer vision analysis: Real-time screen content processing
  • Voice interaction: Natural language processing with ElevenLabs
  • Customizable agents: Configurable personalities and skills
  • Real-time feedback: Audio-visual updates and logging

 

Usage Guide

Installation

  1. Prerequisites:
    • Install Anaconda (recommended for dependency management)
    • Have access to a terminal or command prompt
  2. Clone the repository:
     git clone https://github.com/AgentTankOS/tankwork.git
     cd tankwork
  3. Install the dependencies:
     pip install --upgrade pip setuptools wheel
     pip install -r requirements.txt
  4. Configure the environment:
    • In the project root directory, create the .env file:
      cp .env.example .env

    • Add the API keys and settings to the .env file:
      GEMINI_API_KEY=your_api_key
      OPENAI_API_KEY=your_api_key
      ELEVENLABS_API_KEY=your_api_key
      ANTHROPIC_API_KEY=your_api_key
      ELEVENLABS_MODEL=eleven_flash_v2_5
      COMPUTER_USE_IMPLEMENTATION=tank
      COMPUTER_USE_MODEL=claude-3-5-sonnet-20241022
      COMPUTER_USE_MODEL_PROVIDER=anthropic
      NARRATIVE_LOGGER_NAME=ComputerUse.Tank
      NARRATIVE_MODEL=gpt-4o
      NARRATIVE_TEMPERATURE=0.6
      NARRATIVE_MAX_TOKENS=250
      LOG_LEVEL=INFO

  5. Launch the application:
     python main.py
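
Before launching, it can help to confirm that the .env file defines every API key from the sample configuration. The following is a minimal sketch and not part of TankWork: the helper names parse_env, missing_keys, and check_env_file are hypothetical, the parser handles only simple KEY=value lines, and treating all four API keys as required is an assumption.

```python
from pathlib import Path

# Keys taken from the sample .env in the installation steps. Whether
# TankWork needs all of them at the same time is an assumption.
REQUIRED_KEYS = [
    "GEMINI_API_KEY",
    "OPENAI_API_KEY",
    "ELEVENLABS_API_KEY",
    "ANTHROPIC_API_KEY",
]

# Placeholder value used in the sample configuration.
PLACEHOLDER = "your_api_key"


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict[str, str], required=REQUIRED_KEYS) -> list[str]:
    """Return required keys that are absent, empty, or still the placeholder."""
    return [key for key in required if values.get(key, "") in ("", PLACEHOLDER)]


def check_env_file(path: str = ".env") -> list[str]:
    """Read the file at `path` and report which required keys still need values."""
    env_path = Path(path)
    if not env_path.exists():
        # A missing file means every required key is missing.
        return list(REQUIRED_KEYS)
    return missing_keys(parse_env(env_path.read_text(encoding="utf-8")))
```

Calling check_env_file() from the project root returns an empty list when every key has a real value; otherwise it lists the keys that still need attention.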

Usage

  1. Computer control mode:
    • Control the computer with commands given as text input or by voice.
    • For example, you can say "open browser" or type "open browser" to start the browser.
  2. Computer vision analysis:
    • The agent processes screen content in real time, recognizing and responding to changes on the screen.
    • For example, the agent can automatically perform a preset action when a specific image appears on the screen.
  3. Voice interaction:
    • Interact with agents by voice through the ElevenLabs integration.
    • For example, you can ask the agent about the current weather and the agent will reply by voice.
  4. Customized agents:
    • Configure the agent's personality and skills to meet specific needs.
    • For example, you can set the agent to perform a specific task at a specific time, such as opening the mail client at 8:00 a.m. every day.
  5. Real-time feedback:
    • The agent provides real-time audio and visual updates and operation logs so the user can follow the current operation status.
    • For example, when the agent executes a command, it tells the user the result of the operation by voice.
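
To make the command-based control mode more concrete, here is a minimal sketch of how a typed or transcribed command such as "open browser" could be mapped to an action. It is illustrative only and does not reflect TankWork's internals, which this article does not describe: COMMANDS, normalize, and dispatch are hypothetical names, and the actions return descriptive strings instead of touching the desktop.

```python
from typing import Callable


def open_browser() -> str:
    # Stand-in action: a real agent would launch the browser here.
    return "opening browser"


def open_mail() -> str:
    # Stand-in action: a real agent would launch the mail client here.
    return "opening mail client"


# Hypothetical command table mapping normalized phrases to actions.
COMMANDS: dict[str, Callable[[], str]] = {
    "open browser": open_browser,
    "open mail": open_mail,
}


def normalize(command: str) -> str:
    """Lowercase the command and collapse repeated whitespace."""
    return " ".join(command.lower().split())


def dispatch(command: str) -> str:
    """Run the action registered for a command and return its status message."""
    key = normalize(command)
    action = COMMANDS.get(key)
    if action is None:
        return f"unknown command: {key}"
    return action()
```

With this layout, text input and speech that has already been transcribed can share one dispatch path, and the returned message is what a feedback layer could log or speak. A fixed lookup table is the simplest possible design; an agent that relies on a language model to interpret commands would replace it.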

With these steps, you can install TankWork and use its features to control and manage your computer.

May not be reproduced without permission: Chief AI Sharing Circle » TankWork: an agent that operates computers via voice and text and provides real-time voice feedback
