General Introduction
ElevenLabs MCP is an official open source project from ElevenLabs, hosted on GitHub. It is a server tool based on the Model Context Protocol (MCP), designed to connect AI models with ElevenLabs' speech and audio processing capabilities. With this tool, users can convert text to natural speech, clone personalized voices, transcribe audio, and even create conversational AI agents. It works with MCP clients such as Claude Desktop, Cursor, and Windsurf, letting developers run the server locally and process audio tasks through ElevenLabs' cloud-based API. The free tier offers 10,000 credits per month for personal testing, while the paid plans support larger-scale use.
Function List
- Text-to-speech: Turn text into smooth, natural speech, with support for a wide range of voices and languages.
- Voice cloning: Generate unique AI voices from audio samples.
- Speech-to-text: Convert audio files to text, with support for multi-speaker recognition.
- Conversational AI: Create intelligent agents that communicate by voice and can be used for tasks such as outbound phone calls.
- Audio processing: Provides useful functions such as voice isolation and sound quality enhancement.
- Local Server Support: Run a server on the user's device to connect to the cloud API.
Usage Guide
Installation process
To use ElevenLabs MCP, you need to install and configure the server locally. The following are the detailed steps:
- Prepare the environment
  - Make sure your computer has Python 3.8 or later installed. Check the version with:
    python --version
  - Get an ElevenLabs API key: go to the official ElevenLabs website, register, and find the key on the Settings page.
  - Installing uv (a Python package manager) is recommended. Install it with:
    curl -LsSf https://astral.sh/uv/install.sh | sh
    or refer to the uv repository.
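The environment checks in this step can also be scripted. The following is a minimal sketch, not part of the project itself; the function names are hypothetical, and it only checks the Python version and whether a `uv` executable is on PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the step above


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the interpreter meets the minimum (major, minor) version."""
    return tuple(version_info[:2]) >= minimum


def uv_is_installed(which=shutil.which):
    """Return True when a `uv` executable is found on PATH."""
    return which("uv") is not None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("uv found:", uv_is_installed())
```

The `version_info` and `which` parameters exist only so the checks can be exercised without changing the real environment.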
- Download the project
  - Open a terminal and type:
    git clone https://github.com/elevenlabs/elevenlabs-mcp.git
    cd elevenlabs-mcp
- Install dependencies
  - Install with uv:
    uv pip install -r requirements.txt
  - Or with the default pip:
    pip install -r requirements.txt
- Configure the API key
  - Method 1: Pass the key at runtime:
    python -m elevenlabs_mcp --api-key=YOUR_API_KEY
  - Method 2: Set an environment variable. Type in the terminal:
    export ELEVENLABS_API_KEY=YOUR_API_KEY
    Then run:
    python -m elevenlabs_mcp
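If you wrap the server in your own launcher script, the two methods can be combined. The sketch below is illustrative only: `resolve_api_key` is a hypothetical helper, and the precedence (an explicit value wins over `ELEVENLABS_API_KEY`) is an assumption, not documented behaviour of the server.

```python
import os


def resolve_api_key(cli_value=None, environ=None):
    """Pick the API key: an explicit value wins, then the environment variable.

    The precedence here is an assumption made for this sketch.
    """
    environ = os.environ if environ is None else environ
    key = (cli_value or environ.get("ELEVENLABS_API_KEY", "")).strip()
    if not key:
        raise ValueError("Set ELEVENLABS_API_KEY or pass the key explicitly.")
    return key
```

Failing early with a clear message is usually easier to debug than letting the first API call fail with an authentication error.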
- Start the server
  - The default address is http://127.0.0.1:8000. If there is a port conflict, change the port with --port:
    python -m elevenlabs_mcp --port=8080
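Before choosing a value for `--port`, it can help to check which ports are actually free. This is a rough sketch using only the standard library; the helper names are hypothetical, and the candidate ports are simply the two mentioned in this step.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(candidates=(8000, 8080), host="127.0.0.1"):
    """Return the first free port from candidates, or None if all are taken."""
    for port in candidates:
        if port_is_free(port, host):
            return port
    return None
```

A port reported as free could still be taken by another process before the server starts, so treat the result as a hint rather than a guarantee.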
- Connect clients
  - Claude Desktop
    - Open Claude Desktop and click Menu > Help > Enable Developer Mode in the upper left corner (Windows users need to enable this).
    - Go to Settings > Developer > Edit Config and add this configuration:
      {
        "mcpServers": {
          "ElevenLabs": {
            "command": "uvx",
            "args": ["elevenlabs-mcp"],
            "env": {
              "ELEVENLABS_API_KEY": "YOUR_API_KEY"
            }
          }
        }
      }
    - Save and restart Claude.
  - Other clients (e.g. Cursor, Windsurf)
    - Install the package:
      pip install elevenlabs-mcp
    - Run it to print the configuration:
      python -m elevenlabs_mcp --api-key=YOUR_API_KEY --print
    - Paste the printed configuration into the client's configuration directory.
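If the client's configuration file already lists other MCP servers, the ElevenLabs entry has to be merged in rather than pasted over them. The sketch below shows one way to do that on an in-memory dictionary; `add_elevenlabs_server` is a hypothetical helper, and reading or writing the actual file is left out because its location depends on the client.

```python
import copy
import json


def add_elevenlabs_server(config, api_key):
    """Return a copy of config with the ElevenLabs entry added under mcpServers."""
    merged = copy.deepcopy(config)  # leave the caller's dictionary untouched
    servers = merged.setdefault("mcpServers", {})
    servers["ElevenLabs"] = {
        "command": "uvx",
        "args": ["elevenlabs-mcp"],
        "env": {"ELEVENLABS_API_KEY": api_key},
    }
    return merged


if __name__ == "__main__":
    # "other" is a made-up existing server entry, used only for illustration.
    existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
    print(json.dumps(add_elevenlabs_server(existing, "YOUR_API_KEY"), indent=2))
```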
Functional operation flow
Text-to-speech
- Type in Claude: "Generate speech: 'Hello, I'm ElevenLabs' with ElevenLabs."
- The server generates the audio and returns it, and Claude plays it automatically. A voice can be specified, e.g. "voice: Adam".
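For context, the MCP server forwards requests like this to the ElevenLabs cloud API on your behalf. The sketch below builds, but does not send, the kind of HTTP request involved. The endpoint path, the `xi-api-key` header, and the `eleven_multilingual_v2` model name reflect the public REST API as I understand it and should be checked against the official API reference; `VOICE_ID` is a placeholder, and nothing here describes the server's internal code.

```python
import json
import urllib.request

API_BASE = "https://api.elevenlabs.io/v1"  # assumed public REST base URL


def build_tts_request(text, voice_id, api_key, model_id="eleven_multilingual_v2"):
    """Build (but do not send) a text-to-speech request for the given voice."""
    body = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}",
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_tts_request("Hello, I'm ElevenLabs", "VOICE_ID", "YOUR_API_KEY")
    print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen` would consume credits, which is why the sketch stops at constructing it.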
Voice cloning
- Prepare 2-3 clear audio samples (e.g. sample.wav).
- Enter: "Clone a voice with [sample.wav]."
- The server returns a voice ID, which can then be used to generate speech in the new voice.
Speech-to-text
- Upload an audio file (e.g. audio.mp3).
- Type: "Transcribe audio: audio.mp3 with ElevenLabs."
- The server returns the transcribed text and distinguishes between speakers if there is more than one.
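If you post-process a multi-speaker transcript yourself, a common step is to merge consecutive pieces from the same speaker into readable turns. The sketch below assumes a hypothetical input shape (a list of dictionaries with `speaker` and `text` keys); the actual format returned by the server may differ.

```python
def merge_speaker_turns(segments):
    """Merge consecutive segments from the same speaker into single turns."""
    turns = []
    for segment in segments:
        speaker, text = segment["speaker"], segment["text"].strip()
        if turns and turns[-1]["speaker"] == speaker:
            turns[-1]["text"] += " " + text  # same speaker keeps talking
        else:
            turns.append({"speaker": speaker, "text": text})
    return turns


def format_transcript(segments):
    """Render merged turns as one 'speaker: text' line per turn."""
    return "\n".join(f'{t["speaker"]}: {t["text"]}' for t in merge_speaker_turns(segments))
```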
Conversational AI
- Enter: "Create an AI agent that speaks like a detective and answers movie questions."
- The server creates a voice-enabled agent that you can interact with via text or voice.
Audio processing
- Enter: "Isolate voice from background noise in audio.mp3."
- The server returns the processed audio file.
Debugging and Logging
- Log location:
  - Windows: %APPDATA%\Claude\logs\mcp-server-elevenlabs.log
  - macOS: ~/Library/Logs/Claude/mcp-server-elevenlabs.log
- Timeout issues: operations such as voice design are time-consuming and may time out in development mode, but the task will still be completed.
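The two log locations can be resolved in a small helper, for example when writing a script that tails the log. This is a sketch under the assumption that only the two platforms listed above are covered; `claude_log_path` is a hypothetical name, and the base directories are passed in explicitly so the logic is easy to test.

```python
from pathlib import PurePosixPath, PureWindowsPath

LOG_NAME = "mcp-server-elevenlabs.log"


def claude_log_path(platform, appdata=None, home=None):
    """Return the log path for 'win32' or 'darwin', per the locations listed above."""
    if platform == "win32":
        return PureWindowsPath(appdata) / "Claude" / "logs" / LOG_NAME
    if platform == "darwin":
        return PurePosixPath(home) / "Library" / "Logs" / "Claude" / LOG_NAME
    raise ValueError(f"No log location listed for platform: {platform}")
```

In a real script, `platform` would typically come from `sys.platform`, `appdata` from `os.environ["APPDATA"]`, and `home` from `pathlib.Path.home()`.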
Common Error Resolution
- "spawn uvx ENOENT".
  - Find the uvx path:
    which uvx
  - Update the configuration to use the absolute path, e.g.
    "command": "/usr/local/bin/uvx"
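The same fix can be applied programmatically: look up the absolute path of the command and write it back into the configuration. The sketch below works on an in-memory dictionary shaped like the Claude Desktop configuration; `pin_command_path` is a hypothetical helper, and `shutil.which` plays the role of the `which` command.

```python
import copy
import shutil


def pin_command_path(config, server_name="ElevenLabs", which=shutil.which):
    """Replace a bare command name with its absolute path, if one is found."""
    patched = copy.deepcopy(config)  # do not modify the caller's dictionary
    server = patched["mcpServers"][server_name]
    resolved = which(server["command"])
    if resolved is None:
        raise FileNotFoundError(f'{server["command"]} was not found on PATH')
    server["command"] = resolved
    return patched
```

The error usually means the client was launched with a different PATH than your terminal, so run the lookup from a shell where `uvx` is known to work.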
Application Scenarios
- Content creation
  - Podcast producers use text-to-speech to generate narration or clone their own voices to produce audio in bulk.
- Education and training
  - Teachers convert lesson text to speech to create audio learning materials.
- Customer service
  - Enterprises build voice-based customer service with conversational AI to handle common inquiries.
- Game development
  - Developers generate unique voices for characters to enhance immersion.
FAQ
- Do I have to pay for it?
  - The free tier offers 10,000 credits per month; beyond that, you need to purchase a paid plan.
- What languages are supported?
  - English, Chinese, and other languages are supported; see the official ElevenLabs website for details.
- How do I check usage?
  - Log in to the ElevenLabs website and view your credit consumption on your account page.