
Text2Voice: A Text-to-Speech Graphical Interface Based on Silicon Flow APIs

General Introduction

Text2Voice is an open source text-to-speech tool built on the Silicon Flow API. Its main strength is a clean graphical user interface (GUI). It was published on GitHub by developer Sheldon Lee so that users can turn text into speech without writing code. The project is written in Python and uses the PyQt6 framework for its interface. At its core, it sends text to the API and receives playable audio in return, supports multiple languages such as Chinese and English, and lets you choose between different voice tones. The code is open: anyone can download, run, or modify it, which makes it suitable both for people who want to generate speech quickly and for developers. The project has a stable release with practical features, and you can start using it as soon as it is installed.



 

Function List

  • Converts Chinese, English and other multilingual text to speech through a graphical interface.
  • Provides a wide selection of voice tones.
  • Supports real-time audio playback control, including play, pause and stop.
  • Presents a simple, clean graphical window.
  • Automatically manages generated audio files.
  • Supports converting long text to speech in segments.

 

Using Help

Text2Voice relies on Python and Silicon Flow APIs, and you need to install the environment and configure the keys before using it. Here are the detailed steps to help you get started quickly.

Installation process

  1. Preparing the system environment
    Make sure your computer meets the requirements: Windows, macOS or Linux, 2GB or more of RAM, and a stable internet connection.

    • Installing Python: Visit https://www.python.org/, download version 3.8 or higher, and check the "Add Python to PATH" box during installation.
    • Installing Git: Visit https://git-scm.com/, download and install.
  2. Downloading the project code
    Open a terminal (CMD on Windows, Terminal on Mac/Linux) and run:
git clone https://github.com/axdlee/text2voice.git

Then go to the project directory:

cd text2voice
  3. Setting up a virtual environment (recommended)
    Create and activate a virtual environment to avoid dependency conflicts:
python -m venv venv
  • Windows:
    venv\Scripts\activate
    
  • Mac/Linux:
    source venv/bin/activate
    
  4. Installing dependencies
    The project's dependencies are listed in requirements.txt. Run the following command to install them:
pip install -r requirements.txt

This installs the necessary libraries, such as PyQt6, Requests and Pygame.

  5. Configuring the API key
    In the project root directory, create a .env file with the following contents:
SILICON_API_KEY=your_api_key

Obtain the API key from the Silicon Flow website, replace your_api_key with it, and save the file.

  6. Running the program
    In the terminal, enter:
python main.py

After the program starts, a graphical interface appears.
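
To illustrate how a key stored in .env can be picked up by a Python program, here is a minimal sketch that uses only the standard library. It is not taken from the Text2Voice source: the function names load_env_file and get_api_key are hypothetical, and the project itself may well read the file differently (for example through a dedicated library). Only the variable name SILICON_API_KEY comes from the steps above.

```python
import os
from pathlib import Path
from typing import Dict


def load_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env file (hypothetical helper)."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path: str = ".env") -> str:
    """Return SILICON_API_KEY from the environment, falling back to .env."""
    key = os.environ.get("SILICON_API_KEY") or load_env_file(path).get("SILICON_API_KEY", "")
    if not key:
        raise RuntimeError("SILICON_API_KEY is not set")
    return key
```

Calling get_api_key() from the project root returns the key if the .env file was created as described, and raises an error otherwise, which is a quick way to confirm the configuration step worked.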

How to use the main features

  1. Launching the graphical interface
    Run python main.py. You will then see a window with a text input box and control buttons.
  2. Setting the API key
    Click the "Settings" button in the interface, enter the Silicon Flow API key from your .env file, and save the settings.
  3. Entering text
    Type or paste the text you want to convert to speech into the text box, e.g. "Hello, this is a test".
  4. Selecting a tone
    Pick a voice tone from the drop-down menu, such as male or female (the exact options are determined by the API).
  5. Converting to speech
    Click the "Convert to Speech" button. The program sends the text to the Silicon Flow API, which generates the audio.
  6. Playing audio
    After the conversion is finished, use the "Play" button to listen to the audio, and "Pause" or "Stop" to control playback.
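
For developers who want a rough idea of what the conversion step does behind the button, the sketch below builds (but does not send) an HTTP request for a text-to-speech call. Everything specific in it is an assumption rather than something confirmed by the Text2Voice source: the endpoint URL, the JSON field names (model, input, voice), and the helper name build_tts_request are illustrative, and the model and voice values must be taken from the Silicon Flow API documentation.

```python
import json
import urllib.request

# Assumed endpoint; verify against the Silicon Flow API documentation.
API_URL = "https://api.siliconflow.cn/v1/audio/speech"


def build_tts_request(text: str, voice: str, model: str, api_key: str) -> urllib.request.Request:
    """Build a POST request for a text-to-speech call (hypothetical helper)."""
    # Assumed payload shape: model name, input text, and voice identifier.
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would perform the call and, if the assumptions hold, return audio bytes that can be written to a file. Keeping request construction separate from sending makes it easy to inspect what would be transmitted without using up API quota.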

Using the Featured Functions

  • GUI-based long text segmentation
    If the text exceeds 5000 characters, the program automatically processes it in segments. Enter the complete text in the interface and click "Convert to Speech"; the program generates the audio segment by segment. You can use the play button to listen to each segment.
  • Audio file management
    The generated audio is stored temporarily in the temp folder. These files are deleted automatically when the program exits. If you want to keep them, move them elsewhere manually before exiting.
  • Real-time playback control
    Converted audio can be controlled in real time. Click "Play" to start listening, and "Pause" or "Stop" at any time. All of this is done in the graphical interface.
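
The long-text segmentation described above can be approximated with a short function. This is a sketch of one reasonable approach, not the project's actual algorithm: it prefers to cut after sentence-ending punctuation (Latin or Chinese) and only cuts mid-sentence when a single sentence is longer than the limit. The name split_text and the choice of punctuation marks are assumptions; the 5000-character limit is the figure given in this article.

```python
import re
from typing import List

MAX_CHARS = 5000  # per-request limit mentioned in this article


def split_text(text: str, max_chars: int = MAX_CHARS) -> List[str]:
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    # Split right after sentence-ending punctuation, keeping the punctuation.
    sentences = re.split(r"(?<=[.!?。！？])", text)
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at fixed positions.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if len(current) + len(sentence) > max_chars:
            # The current chunk is full; start a new one with this sentence.
            chunks.append(current)
            current = sentence
        else:
            current += sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be converted on its own, and joining the chunks in order reproduces the original text exactly, so no content is lost at the segment boundaries.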

Caveats

  • A stable network connection is required, because all conversion relies on the Silicon Flow API.
  • It is recommended that a single conversion should not exceed 5000 characters to avoid API errors.
  • API keys should be kept secret and not shared publicly.
  • If the interface doesn't respond, check that the key, network and dependencies are correct.
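
When troubleshooting an unresponsive interface, a small script can rule out the two local causes (dependencies and key) before you look at the network. The sketch below is an illustration only: the package list is taken from the libraries named in the installation section, while the helper names missing_modules and check_environment are hypothetical and not part of Text2Voice.

```python
import importlib.util
import os
from typing import List, Sequence

# Import names of the libraries mentioned in the installation section.
REQUIRED_MODULES = ("PyQt6", "requests", "pygame")


def missing_modules(names: Sequence[str] = REQUIRED_MODULES) -> List[str]:
    """Return the names of modules that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def check_environment(names: Sequence[str] = REQUIRED_MODULES) -> List[str]:
    """Collect human-readable problems with dependencies and the API key."""
    problems = [f"missing Python package: {name}" for name in missing_modules(names)]
    # The key may come from the environment or from a .env file in this folder.
    if not os.environ.get("SILICON_API_KEY") and not os.path.exists(".env"):
        problems.append("no SILICON_API_KEY in the environment and no .env file found")
    return problems
```

Run check_environment() from the project root with the virtual environment activated. An empty list means the packages and key file are in place, which leaves the network or the key's validity as the likely cause.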

With these steps, you can convert text to speech using Text2Voice's graphical interface. Developers can also modify the code to adjust the interface or functionality.

 

Application Scenarios

  1. Educational aids
    Converts text to speech with a graphical interface for easy listening and learning.
  2. Content creation
    Generates speech for videos or podcasts with simple, time-saving operation.
  3. Accessibility support
    Helps visually impaired people access information by converting text to speech through an interface.

 

QA

  1. What languages are supported?
    Multiple languages are supported, including Chinese and English. The exact set is determined by the Silicon Flow API.
  2. Why is the interface not responding?
    It could be an API key error, a network issue, or a dependency that wasn't installed properly. Check each and retry.
  3. Where are the audio files stored?
    They are stored temporarily in the temp folder, which is cleaned up automatically when the program closes.
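
Because the temp folder is emptied on exit, anyone who wants to keep the audio needs to copy it out first. The following sketch automates that manual step. It rests on assumptions that are not confirmed by the project: the folder is assumed to be named temp in the project root as described in this article, the audio files are assumed to end in .mp3 or .wav, and save_audio_files and the saved_audio destination are hypothetical names.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def save_audio_files(
    temp_dir: str = "temp",
    dest_dir: str = "saved_audio",
    extensions: Iterable[str] = (".mp3", ".wav"),  # assumed audio formats
) -> List[Path]:
    """Copy audio files out of the temp folder so they survive program exit."""
    source = Path(temp_dir)
    target = Path(dest_dir)
    if not source.is_dir():
        return []
    target.mkdir(parents=True, exist_ok=True)
    allowed = {ext.lower() for ext in extensions}
    copied: List[Path] = []
    for item in sorted(source.iterdir()):
        # Only copy regular files whose extension looks like audio.
        if item.is_file() and item.suffix.lower() in allowed:
            copied.append(Path(shutil.copy2(item, target / item.name)))
    return copied
```

Run it from the project root while Text2Voice is still open. The function copies rather than moves, so the program's own cleanup is not disturbed.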