
EXO: Running distributed AI clusters using idle home devices with support for multiple inference engines and automated device discovery.

General Introduction

Exo is an open-source project that lets you run your own AI cluster on everyday devices (e.g. iPhone, iPad, Android, Mac, Linux). Through dynamic model partitioning and automated device discovery, Exo unifies multiple devices into what behaves like a single powerful GPU, supporting models such as LLaMA, Mistral, LLaVA, Qwen, and DeepSeek. Exo also provides a ChatGPT-compatible API that allows users to easily run models on their own hardware.


Feature List

  • Broad model support: runs models such as LLaMA, Mistral, LLaVA, Qwen, and DeepSeek.
  • Dynamic model partitioning: splits a model across devices based on the current network topology and each device's resources.
  • Automated device discovery: devices find each other automatically, with no manual configuration required.
  • ChatGPT-compatible API: exposes a ChatGPT-compatible API, making it easy to run models on your own hardware.
  • Device equality: devices connect to each other peer-to-peer; there is no master-worker architecture.
  • Multiple partitioning strategies: supports several partitioning strategies, such as ring memory-weighted partitioning.
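As an illustration of memory-weighted partitioning, the toy Python sketch below assigns each device a contiguous slice of a model's layers in proportion to its share of total cluster memory. The function name, device names, and memory figures are hypothetical; exo's actual strategy is more involved than this.

```python
# Toy sketch of memory-weighted partitioning (illustrative only).
# Each device gets a contiguous, half-open range of layers proportional
# to its share of the cluster's total memory.

def partition_layers(num_layers, device_memory):
    """device_memory: dict mapping device name -> memory in GB.
    Returns dict mapping device name -> (start_layer, end_layer)."""
    total = sum(device_memory.values())
    shards, start = {}, 0
    devices = list(device_memory.items())
    for i, (name, mem) in enumerate(devices):
        if i == len(devices) - 1:
            end = num_layers  # last device takes whatever remains
        else:
            end = start + round(num_layers * mem / total)
        shards[name] = (start, end)
        start = end
    return shards

print(partition_layers(32, {"mac-studio": 64, "macbook": 16, "iphone": 8}))
# → {'mac-studio': (0, 23), 'macbook': (23, 29), 'iphone': (29, 32)}
```

Note that rounding error is absorbed by the last device, so the union of the ranges always covers every layer exactly once.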

 

Using Help

Installation Process

  1. Prerequisites:
    • Make sure the Python version is >= 3.12.0.
    • On Linux with an NVIDIA GPU, install the NVIDIA drivers, CUDA toolkit, and cuDNN library.
  2. Install from source:
    • Clone the project: git clone https://github.com/exo-explore/exo.git
    • Enter the project directory: cd exo
    • Install the dependencies: pip install -e .
    • Or install inside a virtual environment: source install.sh

Operation Flow

  1. Running models:
    • Run across multiple macOS devices:
      • Device 1: exo
      • Device 2: exo
      • Exo automatically discovers the other devices and launches a ChatGPT-like WebUI (powered by tinygrad tinychat) at http://localhost:52415.
    • Run on a single device:
      • Use the command: exo run llama-3.2-3b
      • Use a custom prompt: exo run llama-3.2-3b --prompt "What is the meaning of exo?"
  2. Model storage:
    • By default, models are stored in ~/.cache/huggingface/hub.
    • Set the HF_HOME environment variable to change the model storage location.
  3. Debugging:
    • Enable debug logging with the DEBUG environment variable (0-9): DEBUG=9 exo
    • For the tinygrad inference engine, use the separate TINYGRAD_DEBUG flag (1-6): TINYGRAD_DEBUG=2 exo
  4. Formatting code:
    • Format the code with yapf:
      • Install the formatting requirements: pip3 install -e '.[formatting]'
      • Run the formatting script: python3 format.py ./exo
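The model-storage lookup in step 2 can be sketched in Python as follows. This is a simplification: the real Hugging Face tooling also honors HF_HUB_CACHE and other variables, and the helper name here is hypothetical.

```python
import os
from pathlib import Path

def model_cache_dir():
    """Resolve the model cache directory: HF_HOME overrides the default
    ~/.cache/huggingface, and downloaded models live under its 'hub'
    subdirectory. A sketch of the resolution order described above."""
    hf_home = os.environ.get("HF_HOME", str(Path.home() / ".cache" / "huggingface"))
    return str(Path(hf_home) / "hub")
```

So `HF_HOME=/mnt/models exo` would place downloaded models under /mnt/models/hub instead of the default location.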

Usage

  1. Start EXO:
   exo

EXO will automatically discover and connect to other devices without additional configuration.

  2. Run a model:
    • Use the default model:
     exo run llama-3.2-3b

    • Use a custom prompt:
     exo run llama-3.2-3b --prompt "What is the meaning of EXO?"

  3. API usage example:
    • Send a request:
      curl http://localhost:52415/v1/chat/completions \
        -H "Content-Type: application/json" \
        -d '{
          "model": "llama-3.2-3b",
          "messages": [{"role": "user", "content": "What is the meaning of EXO?"}],
          "temperature": 0.7
        }'
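The same request can be issued from Python using only the standard library. This assumes an exo node is already serving the ChatGPT-compatible API on localhost:52415; the helper function names are our own.

```python
import json
import urllib.request

API_URL = "http://localhost:52415/v1/chat/completions"

def build_payload(prompt, model="llama-3.2-3b", temperature=0.7):
    """Build the ChatGPT-compatible request body shown in the curl example."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def chat(prompt):
    """POST the payload to a running exo node and return the reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Usage (requires a running node):
# reply = chat("What is the meaning of EXO?")
```

Because the API follows the ChatGPT schema, existing OpenAI-compatible client libraries should also work by pointing their base URL at the exo node.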

Performance Optimization

  • macOS users:
    • Upgrade to the latest version of macOS.
    • Run ./configure_mlx.sh to optimize GPU memory allocation.

Common Problems

  • SSL errors: on some macOS/Python versions, certificates are not installed correctly. Run the following command to fix this:
  /Applications/Python 3.x/Install Certificates.command
  • Debug logs: enable debug logging with:
  DEBUG=9 exo
