
LiteLLM: A Python SDK for Unified Calling of Multiple Large Model APIs, and a Multi-Platform LLM Management Tool

 

General Introduction

LiteLLM is a Python SDK and proxy server developed by BerriAI that simplifies and unifies calling and managing multiple Large Language Model (LLM) APIs. It supports more than 100 model APIs, including OpenAI, HuggingFace, and Azure, and exposes them all in the OpenAI format, making it easy for developers to switch between and manage different AI services. It also ships a stable Docker image and a detailed migration guide. By letting users call 100+ LLM APIs in the OpenAI format through either the proxy server or the Python SDK, LiteLLM greatly improves development efficiency and flexibility.
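A minimal sketch of the unified SDK call (assuming provider API keys are set as environment variables; the model names here are just examples):

    import os
    from litellm import completion

    os.environ["OPENAI_API_KEY"] = "your-openai-key"  # placeholder

    messages = [{"role": "user", "content": "Hello, how are you?"}]

    # The call shape is identical for every provider; only the model string changes.
    response = completion(model="gpt-3.5-turbo", messages=messages)
    # e.g. completion(model="claude-3-haiku-20240307", messages=messages) for Anthropic

    # Responses are always returned in the OpenAI format.
    print(response.choices[0].message.content)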

Screenshots of the LiteLLM proxy UI illustrate four core workflows:

  1. Creating keys
  2. Adding models
  3. Tracking expenditures
  4. Configuring load balancing (a Router sketch follows this list)
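On the SDK side, load balancing can be sketched with LiteLLM's Router, which spreads requests across several deployments registered under one model group name; the keys and endpoints below are placeholders:

    from litellm import Router

    # Two deployments share the group name "gpt-3.5-turbo".
    router = Router(model_list=[
        {
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {"model": "openai/gpt-3.5-turbo", "api_key": "sk-openai-..."},
        },
        {
            "model_name": "gpt-3.5-turbo",
            "litellm_params": {
                "model": "azure/gpt-35-turbo",
                "api_key": "azure-key",
                "api_base": "https://example-resource.openai.azure.com",
                "api_version": "2024-02-01",
            },
        },
    ])

    # Each call is routed to one of the registered deployments.
    response = router.completion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(response.choices[0].message.content)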

Function List

  • Multi-platform support: supports multiple LLM providers such as OpenAI, Cohere, and Anthropic, with more than 100 model APIs callable in total.
  • Stable releases: provides stable Docker images that have been load-tested for 12 hours, plus support for setting budgets and request-rate limits (a key-generation sketch follows this list).
  • Proxy server: calls multiple LLM APIs through a single proxy server, converting every API to the OpenAI format.
  • Python SDK: a Python SDK that simplifies the development process.
  • Streaming responses: supports streaming model responses back to the client for a better user experience.
  • Callbacks: supports multiple callbacks for easy logging and monitoring.
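Budgets and request-rate limits are typically attached when a virtual key is created through the proxy. A hedged sketch against the proxy's /key/generate endpoint (the address, master key, and limit values are assumptions carried over from the setup steps below):

    import requests

    # Assumes a proxy running locally with the master key from the setup steps.
    resp = requests.post(
        "http://0.0.0.0:4000/key/generate",
        headers={"Authorization": "Bearer sk-1234"},  # proxy master key (placeholder)
        json={
            "max_budget": 10.0,  # spend cap in USD for this key
            "rpm_limit": 60,     # requests per minute
        },
    )
    print(resp.json()["key"])  # the newly issued virtual key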

 

Using Help

Installation and Setup

  1. Installing Docker: Ensure that Docker is installed on your system.
  2. Pulling the image: Use the docker pull command to pull a stable LiteLLM image.
  3. Starting the proxy server:
    git clone https://github.com/BerriAI/litellm
    cd litellm
    echo 'LITELLM_MASTER_KEY="sk-1234"' > .env
    echo 'LITELLM_SALT_KEY="sk-1234"' >> .env
    source .env
    docker-compose up
    
  4. Configuring the client: Set the proxy server address and API key in your code.
    import openai

    client = openai.OpenAI(api_key="your_api_key", base_url="http://0.0.0.0:4000")
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": "Hello, how are you?"}],
    )
    print(response)
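To check that the proxy is up and see which models it currently serves, the same client can list models through the proxy's OpenAI-compatible /models endpoint:

    # Reuses the `client` configured in step 4.
    for model in client.models.list().data:
        print(model.id)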
    

Usage Functions

  1. Calling models: Use model=<provider_name>/<model_name> to call models from different providers.
  2. Streaming responses: Set stream=True to get a streaming response.
    from litellm import acompletion

    response = await acompletion(model="gpt-3.5-turbo", messages=messages, stream=True)
    async for part in response:
        print(part.choices[0].delta.content or "")
    
  3. Setting callbacks: Configure callback functions to log inputs and outputs (a custom-callback sketch follows below).
    import litellm

    litellm.success_callback = ["lunary", "langfuse", "athina", "helicone"]
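Beyond the built-in integrations above, a plain Python function can also be registered. A minimal sketch, assuming the four-argument custom-callback signature from LiteLLM's docs:

    import litellm
    from litellm import completion

    def log_success(kwargs, completion_response, start_time, end_time):
        # Runs after every successful call; log whatever is useful.
        print("model:", kwargs.get("model"))
        print("latency:", (end_time - start_time).total_seconds(), "s")

    litellm.success_callback = [log_success]

    completion(model="gpt-3.5-turbo", messages=[{"role": "user", "content": "Hello"}])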

 
