General Introduction
Text2Voice is an open source tool that provides text-to-speech functionality based on the SiliconFlow API, and its defining feature is a clean graphical user interface (GUI). It was published on GitHub by developer Sheldon Lee so that users can easily turn text into speech through a graphical interface. The project is written in Python and uses the PyQt6 framework for its interface. At its core, it calls the API to turn text into playable audio in real time, supports multiple languages such as Chinese and English, and lets you choose between different voice tones. The code is open, so anyone can download, run, or modify it, which makes it suitable both for people who want to generate speech quickly and for developers. The project has a stable release with practical features, and you can start using it right after installation.
Function List
- Convert Chinese, English and other multilingual text to speech through a graphical interface.
- Provides a wide selection of voice tones.
- Supports real-time audio playback control, including play, pause and stop.
- Presents a clean, simple graphical window.
- Manages generated audio files automatically.
- Supports converting long text to speech by splitting it into segments.
Usage Guide
Text2Voice relies on Python and the SiliconFlow API, so you need to set up the environment and configure an API key before using it. The detailed steps below will help you get started quickly.
Installation process
- Preparing the system environment
Make sure your computer meets the requirements: Windows, macOS, or Linux; 2 GB or more of RAM; and a stable internet connection.
- Installing Python: Visit https://www.python.org/, download version 3.8 or higher, and check the "Add Python to PATH" box during installation.
- Installing Git: Visit https://git-scm.com/, then download and install it.
- Download Project Code
Open a terminal (CMD on Windows, Terminal on Mac/Linux) and run:
git clone https://github.com/axdlee/text2voice.git
Then go to the project directory:
cd text2voice
- Setting up a virtual environment (recommended)
Create and activate a virtual environment to avoid dependency conflicts:
python -m venv venv
- Windows:
venv\Scripts\activate
- Mac/Linux:
source venv/bin/activate
- Installing dependencies
The project's dependencies are listed in the requirements.txt file. Run the following command to install them:
pip install -r requirements.txt
This will install the necessary libraries such as PyQt6, Requests, Pygame, and so on.
- Configuring API Keys
In the project root directory, create a .env file with the following contents:
SILICON_API_KEY=your_api_key
Obtain the API key from the SiliconFlow website, put it in place of the placeholder, and save the file.
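As an illustration of how a key stored this way can be read, here is a minimal sketch using only the standard library. The function names `load_env_file` and `get_api_key` are hypothetical and not taken from the project, which may well use a dedicated library for this instead.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines from a .env-style file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without a key/value separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path=".env"):
    """Return SILICON_API_KEY from the environment or the .env file, else None."""
    return os.environ.get("SILICON_API_KEY") or load_env_file(path).get("SILICON_API_KEY")
```

Calling `get_api_key()` from the project root returns the stored key, or `None` if neither the environment variable nor the file provides one.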
- Running the program
In the terminal, enter:
python main.py
After the program starts, a graphical interface appears.
How to use the main features
- Launching the graphical interface
After you run python main.py, you will see a window with a text input box and control buttons.
- Setting the API key
Click the "Settings" button in the interface, enter the SiliconFlow API key from your .env file, and save the settings.
- Entering text
Type or paste the text you want to convert to speech in the text box, e.g. "Hello, this is a test".
- Selecting a tone
Pick a voice tone from the drop-down menu, such as male or female (the exact options are determined by the API).
- Converting to speech
Click the "Convert to Speech" button, and the program sends the text to the SiliconFlow API to generate audio.
- Playing audio
After the conversion finishes, use the "Play" button to listen to the audio, and "Pause" or "Stop" to control playback.
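Behind the conversion step, a program like this typically sends the text and the chosen tone to the speech API as an authenticated HTTP request. The sketch below only builds such a request and does not send it. The endpoint URL is a placeholder, and the bearer-token header and the JSON field names (`model`, `input`, `voice`) are assumptions for illustration; check the API provider's documentation for the real values. `build_tts_request` is a hypothetical helper, not a function from the project.

```python
import json
import urllib.request


def build_tts_request(endpoint, api_key, text, voice, model):
    """Build (but do not send) a JSON POST request for a text-to-speech call."""
    # Field names are assumed for illustration; the real API may differ.
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_tts_request(
    endpoint="https://api.example.invalid/v1/audio/speech",  # placeholder URL
    api_key="sk-placeholder",
    text="Hello, this is a test",
    voice="example-voice",
    model="example-model",
)
# Sending it would look like: audio_bytes = urllib.request.urlopen(request).read()
```

Keeping request construction separate from sending makes it possible to inspect the headers and body without a network connection or a valid key.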
How to use the featured functions
- GUI-based long text segmentation
If the text exceeds 5000 characters, the program automatically processes it in segments. Enter the complete text in the interface and click "Convert to Speech"; the program generates the audio segment by segment. You can use the play button to listen to each segment.
- Audio file management
The generated audio is stored temporarily in the temp folder. These files are deleted automatically when the program exits. If you want to keep them, move them elsewhere manually before exiting.
- Real-time playback control
The converted audio can be controlled in real time. Click "Play" to start listening, and "Pause" or "Stop" at any time; all of this is done in the graphical interface.
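The segmentation described above can be sketched as follows. This is one possible approach, assuming segments are cut at the last sentence-ending character within the limit and hard-cut when no such character exists; the project's actual splitting rules are not documented here, and `split_text` is a hypothetical name.

```python
def split_text(text, max_chars=5000):
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    if max_chars < 1:
        raise ValueError("max_chars must be at least 1")
    sentence_ends = "。！？.!?\n"
    chunks = []
    remaining = text
    while len(remaining) > max_chars:
        window = remaining[:max_chars]
        # Position of the last sentence-ending character inside the window (-1 if none).
        cut = max(window.rfind(ch) for ch in sentence_ends)
        # No sentence boundary found: fall back to a hard cut at the limit.
        end = cut + 1 if cut >= 0 else max_chars
        chunks.append(remaining[:end])
        remaining = remaining[end:]
    if remaining:
        chunks.append(remaining)
    return chunks
```

Joining the returned chunks reproduces the original text exactly, so each chunk can be converted on its own and the resulting audio played back in order.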
Caveats
- The network connection must be stable, because the functionality relies on the SiliconFlow API.
- It is recommended that a single conversion should not exceed 5000 characters to avoid API errors.
- API keys should be kept secret and not shared publicly.
- If the interface doesn't respond, check that the key, network and dependencies are correct.
With these steps, you can convert text to speech using Text2Voice's graphical interface. Developers can also modify the code to adjust the interface or functionality.
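When the interface does not respond, part of the troubleshooting checklist above can be automated. The sketch below checks the Python version, whether the dependencies named earlier can be imported, and whether a .env file exists. `check_environment` is a hypothetical helper; the module names are the usual import names of PyQt6, Requests, and Pygame, and the real dependency list is whatever requirements.txt contains.

```python
import importlib.util
import os
import sys


def check_environment(modules=("PyQt6", "requests", "pygame"), env_path=".env"):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8 or higher is required")
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"missing dependency: {name}")
    if not os.path.isfile(env_path):
        problems.append(f"key file not found: {env_path}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print(problem)
```

Run it from the project root with the virtual environment activated. It does not test the network connection or whether the key itself is valid.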
Application Scenarios
- Educational aids
Convert text to speech through the graphical interface for easy listening and learning.
- Content creation
Generate speech for videos or podcasts with simple, time-saving operation.
- Accessibility support
Help visually impaired people access information by converting text to speech through the interface.
QA
- What languages are supported?
Multiple languages are supported, including Chinese and English; the exact set is determined by the SiliconFlow API.
- Why is the interface not responding?
It could be an API key error, a network issue, or a dependency that was not installed properly. Check these and retry.
- Where are the audio files stored?
They are stored temporarily in the temp folder, which is cleaned up automatically after the program is closed.
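For developers curious how cleanup on exit can work, here is a minimal sketch using the standard library. It assumes the whole folder is removed when the interpreter exits, which may differ from what the project actually does; `prepare_temp_dir` and `keep_audio` are hypothetical helper names.

```python
import atexit
import shutil
from pathlib import Path


def prepare_temp_dir(path="temp"):
    """Create the audio temp folder and schedule its removal at interpreter exit."""
    temp_dir = Path(path)
    temp_dir.mkdir(parents=True, exist_ok=True)
    # ignore_errors avoids a failure at exit if the folder was already removed.
    atexit.register(shutil.rmtree, temp_dir, ignore_errors=True)
    return temp_dir


def keep_audio(temp_file, destination_dir):
    """Copy a generated audio file out of the temp folder before cleanup."""
    destination = Path(destination_dir)
    destination.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(temp_file, destination))
```

Calling `keep_audio` on a file before the program exits preserves a copy in a folder of your choice, which mirrors the manual step of moving files out of temp.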