General Introduction
MiniMax Audio is an AI speech generation tool from MiniMax. Its core feature is quickly converting text into natural-sounding speech. It is built on the Speech-02 model, with a claimed voice similarity of up to 99%, studio-grade sound quality, and support for more than 30 languages and multiple accents. Users can enter text, upload files, or paste links to generate audio, which makes it suitable for producing audiobooks, podcasts, and similar content.
The domestic version (hailuoai.com) and the international version (minimax.io) are similar in functionality. The international version focuses more on ultra-long text processing (up to 200,000 characters) and offers flexible subscription plans. Its free tier grants 4,000 credits per day (about 5 minutes of audio generation), while the paid plans support commercial use and more features. The tool is easy to operate, which makes it popular with creators and developers.
The domestic version is currently free for a limited time with no restrictions on use.
- Conch Voice (domestic version)
- MiniMax Audio (international version)
Function List
- Text-to-speech: Enter text to quickly generate natural speech in multiple languages and voices.
- Voice cloning: Upload at least 10 seconds of audio to replicate a voice with high similarity.
- Extra-long text processing: The international version supports 200,000 characters at a time, while the domestic version is limited to 5,000 or 10,000 characters.
- File and link support: Upload a file or enter a URL to extract text and generate audio.
- Emotion control: Adjust the emotion of the voice, such as happy or calm (the paid version supports more options).
- Multilingual coverage: More than 30 languages are supported; the free version is limited to 16.
- History management: View, delete, or organize generated records.
- API integration: A developer interface is provided for embedding speech generation in other applications.
Usage Guide
MiniMax Audio requires no installation and runs directly in the browser. The domestic and international versions work in essentially the same way; a detailed guide follows.
How to get started
- Visit the domestic version at https://hailuoai.com/audio or the international version at https://www.minimax.io/audio.
- Click "Login" to register or log in with your email address.
- After logging in, you will enter the main screen, which contains text input boxes and function options.
Basic operations for generating speech
- Enter text: Type something in the text box, such as "Welcome to MiniMax Audio".
- Select language and voice: Choose a language (e.g. "Chinese") and a voice (e.g. "Male voice - low").
- Generate audio: Click "Generate"; after a few seconds you can listen to the result or download it as an MP3 file.
- View consumption: The international version shows the credits consumed (1 English character = 1 point, 1 Chinese character = 2 points); the domestic version counts the same way.
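The consumption rule above can be sketched in Python as a rough estimator. It assumes that a "Chinese character" means a CJK Unified Ideograph (U+4E00 to U+9FFF) and that every other character, including spaces and punctuation, costs 1 point; the article does not say how the service itself counts these. `estimate_credits` is a hypothetical helper, not part of any MiniMax SDK.

```python
def is_chinese(ch: str) -> bool:
    """Assumption: a 'Chinese character' is a CJK Unified Ideograph."""
    return "\u4e00" <= ch <= "\u9fff"


def estimate_credits(text: str) -> int:
    """Estimate points: 2 per Chinese character, 1 per any other character."""
    return sum(2 if is_chinese(ch) else 1 for ch in text)


print(estimate_credits("Welcome to MiniMax Audio"))  # 24
print(estimate_credits("你好"))  # 4
```

Comparing the estimate with the daily 4,000 points gives a quick sense of whether a text fits in the free allowance before generating it.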
Using files or links
- Upload a file: Click "Upload File"; TXT, PDF, and other formats are supported, and the text is extracted automatically.
- Enter a link: Paste the URL of a web page and click "Load" to fetch its content.
- Generate: Confirm the extracted text, click "Generate", and download the audio.
Voice cloning function
- Prepare a sample: Record at least 10 seconds of clear audio and save it as MP3 or WAV.
- Upload and create: Upload the sample under the "Voice Clone" option and click "Create Voice".
- Apply the clone: Select the new voice and enter text to generate audio.
- Limits: The free version is limited to 3 clones, the Starter version to 10 clones, and the Standard version to 100 clones.
Text length and credits
- International version: Up to 200,000 characters at a time; long text is processed asynchronously.
- Domestic version: HD mode is limited to 5,000 characters and Turbo mode to 10,000 characters.
- International version credits: The free version grants 4,000 points per day (about 5 minutes of audio), the Starter version 100,000 points per month (about 2 hours), and the Standard version 1,000,000 points per month (about 20 hours).
Subscriptions & Top-ups (International Version)
- Free version: About 2.5 hours of audio per month, limited to 16 languages.
- Starter Edition: $5/month for about 4.5 hours, with faster generation and support for commercial use.
- Standard Edition: $30/month for about 22.5 hours, with a higher cloning cap.
- Top-up: $30 per 1 million points, with a $5 minimum; no subscription required.
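The monthly hour figures can be roughly reproduced from the credit numbers quoted earlier. The sketch below assumes a rate of 800 points per minute (derived from "4,000 points is about 5 minutes"), a 30-day month, and that paid plans receive the daily free points on top of their monthly allotment. That last point is an inference that makes the numbers line up, not something the article states.

```python
POINTS_PER_MINUTE = 4000 / 5   # derived from "4,000 points is about 5 minutes"
DAILY_FREE_POINTS = 4000
DAYS_PER_MONTH = 30            # assumption: a 30-day month


def hours(points: float) -> float:
    """Convert points to hours of audio at the assumed rate."""
    return points / POINTS_PER_MINUTE / 60


def monthly_hours(plan_points: int) -> float:
    """Assumed total: daily free points for a month plus the plan's points."""
    return hours(DAILY_FREE_POINTS * DAYS_PER_MONTH + plan_points)


for name, plan_points in [("Free", 0), ("Starter", 100_000), ("Standard", 1_000_000)]:
    print(f"{name}: {monthly_hours(plan_points):.1f} h")
```

Under these assumptions the result is about 2.5, 4.6, and 23.3 hours. The quoted 4.5 and 22.5 hours match adding the free 2.5 hours to the rounded "about 2 hours" and "about 20 hours".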
API Usage
- Get the key: Log in and apply at https://www.minimax.io/platform/document/T2AV2 or at the domestic API page.
- Call example:
curl -X POST https://api.minimax.io/audio \
-H "Authorization: Bearer <API Key>" \
-H "Content-Type: application/json" \
-d '{"text": "你好,这是测试", "language": "zh", "voice": "female_gentle"}'
- Documentation: See the link above for the international version and https://hailuoai.com/api for the domestic version.
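For readers who prefer Python to curl, the same call can be sketched with the standard library. The endpoint URL and the `text`, `language`, and `voice` fields are copied from the curl example above and have not been checked against the official API reference; `build_tts_request` is a hypothetical helper name. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Endpoint as given in the article's curl example; treat it as unverified.
API_URL = "https://api.minimax.io/audio"


def build_tts_request(api_key: str, text: str, language: str = "zh",
                      voice: str = "female_gentle") -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {"text": text, "language": language, "voice": voice}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_tts_request("<API Key>", "你好,这是测试")
print(req.get_method(), req.full_url)
```

To actually send the request, pass `req` to `urllib.request.urlopen`. The article does not describe the response format, so check the official documentation before parsing the result.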
Usage Notes
- International version: Free (personal) use requires attribution, and commercial use requires a Starter or Standard subscription.
- Optimization tips: If the audio quality is poor, try a different voice or generate the text in segments.
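Generating in segments, as suggested above, can be prepared offline by splitting the text first. The sketch below is one possible approach, not the tool's own behavior: it splits at sentence-ending punctuation (Western or Chinese) and keeps each chunk within a character limit. The default of 5,000 is borrowed from the domestic HD mode limit, and `split_into_segments` is a hypothetical helper.

```python
import re


def split_into_segments(text: str, limit: int = 5000) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    # Split right after sentence-ending punctuation, keeping the punctuation.
    sentences = re.split(r"(?<=[.!?。!?])", text)
    segments, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > limit:
            if current:
                segments.append(current)
                current = ""
            segments.append(sentence[:limit])
            sentence = sentence[limit:]
        # Start a new chunk when adding this sentence would exceed the limit.
        if len(current) + len(sentence) > limit:
            segments.append(current)
            current = ""
        current += sentence
    if current:
        segments.append(current)
    return segments


parts = split_into_segments("First sentence. Second sentence. Third one.", limit=20)
print(parts)
```

Joining the returned segments reproduces the original text exactly, so each chunk can be generated separately and the audio files concatenated in order.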
The tool is simple to operate: you can get started in a few minutes, and it suits a variety of needs.
Application Scenarios
- Audiobook production: Convert long texts to audio to create audiobooks for sharing or publishing.
- Podcast production: Enter scripts to quickly generate podcasts and save recording time.
- Educational aids: Convert course materials to audio for convenient listening or to assist visually impaired learners.
- Game dubbing: Use voice cloning to generate unique voices for characters and enhance the experience.
Q&A
- What is the difference between the domestic and international versions?
The international version supports very long text of up to 200,000 characters, while the domestic version is limited to 5,000 or 10,000 characters but is free for a limited time.
- How much audio can the free international version generate?
4,000 points per day, approximately 5 minutes of audio, or up to 2.5 hours per month.
- What languages are supported?
More than 30 languages, including Chinese and English; the free version is limited to 16.
- How much audio is needed for voice cloning?
At least 10 seconds of clear audio.
- Can it be used commercially?
The international version requires a Starter or Standard subscription for commercial use; the domestic version has no explicit restriction.