General Introduction
Gemini Balance is an OpenAI API proxy service developed based on the FastAPI framework, aiming to provide efficient multi-API Key management and optimization features. The project supports Gemini model calls, and its main features include multi-API Key polling, authentication forensics, streaming responses, CORS cross-domain support, and health check interfaces. By using technology stacks such as Python 3.9+ and Docker, Gemini Balance provides developers with a flexible and efficient API proxy solution for application scenarios requiring high concurrency and high reliability.
Function List
- Multi-API Key Polling Support
- Bearer Token Authentication
- Streaming response support
- CORS cross-domain support
- Health Check Interface
- Support for Gemini model calls
- Support for search function
- Support for Code Execution
Using Help
Environmental requirements
- Python 3.9+
- Docker (optional)
Installation of dependencies
pip install -r requirements.txt
configuration file
establish.env
file and configure the following parameters:
API_KEYS=["your-api-key-1", "your-api-key-2"]
ALLOWED_TOKENS=["your-access-token-1", "your-access-token-2"]
BASE_URL="https://generativelanguage.googleapis.com/v1beta"
TOOLS_CODE_EXECUTION_ENABLED=true
MODEL_SEARCH=["gemini-2.0-flash-exp"]
Docker Deployment
docker build -t gemini-balance .
docker run -p 8000:8000 -d gemini-balance
API interface
- Getting a list of models
GET /v1/models
Authorization: Bearer your-token
- Chat complete.
POST /v1/chat/completions
Authorization: Bearer your-token
{
"messages": [...] ,
"model": "gemini-1.5-flash-002",
"temperature": 0.7,
"stream": false, "tools": [].
"tools": []
}
- Get Embedding
POST /v1/embeddings
Authorization: Bearer your-token
{
"input": "Your text here", "model": "text-embedding-004", "text-embedding-004
"model": "text-embedding-004"
}
- health checkup
GET /health
code structure
app/
api/
routes.py
: API Routingdependencies.py
: Dependency Injectioncore/
config.py
: Configuration managementsecurity.py
: Security certificationservices/
chat_service.py
: Chat Servicekey_manager.py
: Key Managementmodel_service.py
: Modeling servicesschemas/
request_model.py
: Request modelmain.py
: Main Program Entry
Dockerfile
: Docker Configurationrequirements.txt
: Project dependencies
Safety Features
- API Key Polling Mechanism
- Bearer Token Authentication
- Request logging
- Failure Retry Mechanism
- Key validity check
caveat
- Be sure to keep your API Keys and access tokens in a safe place!
- It is recommended to configure sensitive information in production environments using environment variables
- The default service port is 8000
- The default number of API Key failure retries is 10.
- See the Gemini API documentation for a list of supported models
Addendum: huggingface deploys gemini agent, account polling calls, unlocking area restrictions
1. Copy space space
Gemini Balance - a Hugging Face Space by snailyp
2. Modify the visibility toPublic
(be sure to note that the change to public, or not accessible), configurationALLOWED_TOKENS
,API_KEYS
,BASE_URL
default (setting)
ALLOWED_TOKENS
format["Customize apikey"]
Note that parentheses, commas, and quotation marks are strictly adhered to.
API_KEYS
The format is in the form of a single key:["gemini_key1"]
The form of multiple keys["gemini_key1", "gemini_key2"]
Note that parentheses, commas, and quotation marks are strictly adhered to.
BASE_URL
leave the defaults as they are
3. Wait for the deployment to be successful, after the deployment is successful, the following logs and running status will appear
4. At this point the default host of the huggingface service ishuggingface username-gemini-balance.hf.space
Like mine is.snailyp-gemini-balance.hf.space
take note of: Huggingface service will enter sleeping if it is not used for 48h, it is recommended to keep it alive by timed tasks such as green dragon panel or uptime kuma. (Called directly by get request)https://用户名-gemini-balance.hf.space
(can be done)
5. Supported endpoints/hf/v1/models
cap (a poem)/hf/v1/chat/completions
6.huggingface domain name may not be able to directly access the domestic, hf.space seems to be able to access, if you can not access, you can refer to the following process.
You can use cf workers as a proxy and bind a customized domain name, then you can access it in China, cf worker proxy code is as follows.Change url.host to your own.::
export default { async fetch(request, env) { const url = new URL(request.url); url.host = 'xxx-gemini-balance.hf.space'; url.pathname = "/hf" + url. pathname; return fetch(new Request(url, request)) } }
take note of: The proxied endpoint removes /hf, so the endpoint now becomes/v1/models
cap (a poem)/v1/chat/completions
The client configuration requires some attention.
7. Some limitations, at present it is not possible to do image processing and structure words output