
Voice-Pro: an open-source, multifunctional video translation tool for voice transcription and translation into multiple languages, with one-click installation on Windows

General Introduction

Voice-Pro is a multifunctional, Gradio-based WebUI tool that supports speech-to-text, text-to-speech, real-time translation, YouTube video downloads, and vocal separation. It integrates Whisper, Faster-Whisper, and Whisper-Timestamped to provide efficient audio processing and translation across multiple languages and scenarios.

Function List

  • Speech-to-text: Supports Whisper, Faster-Whisper, and Whisper-Timestamped for highly accurate speech recognition.
  • Text-to-speech: Supports Edge-TTS and F5-TTS, with a choice of languages and voices and adjustable speed, volume, and pitch.
  • Real-time translation: Supports real-time speech recognition and translation for multiple languages.
  • YouTube download: Downloads YouTube videos and extracts the audio (mp3, wav, flac).
  • Vocal separation: Separates vocals from background sound using the MDX-Net and Demucs engines.
  • Batch processing: Supports subtitle generation, translation, and text-to-speech for large batches of files.
  • Subtitle generation: Supports generating and editing subtitles in more than 90 languages.
  • Multi-format support: Handles all video and audio formats supported by ffmpeg.
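To make the subtitle generation feature more concrete, here is a minimal sketch of how timestamped speech segments can be written out in the SRT subtitle format. It is not taken from Voice-Pro's source code; the function names and the (start, end, text) segment layout are assumptions chosen for illustration.

```python
from typing import Iterable, Tuple

# Hypothetical segment layout: (start_seconds, end_seconds, text).
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [(0.0, 1.5, "Hello."), (1.5, 4.0, "Welcome to the demo.")]
    print(segments_to_srt(demo))
```

A speech recognizer that reports start and end times per segment could feed its output into a function like this to produce a subtitle file.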

 

Using Help

Installation process

  1. Get the source code: Clone or download the latest version of the source code from GitHub.
    git clone https://github.com/abus-aikorea/voice-pro.git
  2. Install and run the program:
    • Run configure.bat to install the required dependencies (e.g. git, ffmpeg and CUDA).
    • Run start.bat to start Voice-Pro; the WebUI will open automatically.
    • On the first run, Voice-Pro installs its components, which may take an hour or more. Do not close the Windows command window during this time.
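If installation seems to stall, it can help to check which command-line dependencies are already visible on the system PATH. The following is a small sketch for that purpose, not part of Voice-Pro itself; the helper name `missing_tools` is hypothetical, and it only checks for executables such as git and ffmpeg (it does not detect CUDA).

```python
import shutil
from typing import Callable, Iterable, List, Optional


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the names of tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    # git and ffmpeg are the command-line dependencies named above.
    missing = missing_tools(["git", "ffmpeg"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("git and ffmpeg are both available.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is enough.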

Usage Functions

  1. Speech-to-text:
    • Select the Whisper model and calculation type in the Studio tab.
    • Upload an audio file or select an audio input source (such as a microphone).
    • Click the "Start" button and wait for speech recognition and subtitle creation to complete.
  2. Translation:
    • Upload the text or subtitle file to be translated in the Translate tab.
    • Select the target language and click the "Translate" button.
    • Once the translation is complete, you can download the translated file.
  3. Text-to-speech:
    • Select Edge-TTS or F5-TTS in the TTS tab.
    • Enter the text to be converted and select the speech parameters (e.g. speed, volume, pitch).
    • Click the "Generate Voice" button and wait for voice generation to complete.
  4. YouTube download:
    • Enter the YouTube video link in the YouTube Downloader tab.
    • Select the audio format (mp3, wav, flac) and click the "Download" button.
    • Once the download is complete, you can find the audio file in the specified folder.
  5. Vocal separation:
    • Upload audio files in the Vocal Remover tab.
    • Select the MDX-Net or Demucs engine and click the "Start" button.
    • Once separation is complete, you can download the separated audio files.
  6. Batch processing:
    • Upload multiple files in the Batch tab.
    • Select the desired operation (subtitling, translation, text-to-speech).
    • Click the "Start" button and wait for batch processing to complete.
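The audio formats offered in the YouTube Downloader tab (mp3, wav, flac) are all formats that ffmpeg can write. As a rough illustration of the underlying idea, the sketch below builds an ffmpeg command that drops the video stream and saves the audio in the chosen format. It is an assumption about how such an extraction can be done by hand, not a description of Voice-Pro's internals; the function names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List

# The three audio formats listed for the YouTube Downloader tab.
SUPPORTED_FORMATS = ("mp3", "wav", "flac")


def build_extract_command(source: str, audio_format: str) -> List[str]:
    """Build an ffmpeg command that writes only the audio of `source`."""
    if audio_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported audio format: {audio_format}")
    target = Path(source).with_suffix(f".{audio_format}")
    # -y overwrites an existing output file, -vn disables the video stream;
    # ffmpeg picks the output container from the target file extension.
    return ["ffmpeg", "-y", "-i", source, "-vn", str(target)]


def extract_audio(source: str, audio_format: str = "mp3") -> str:
    """Run ffmpeg (must be on PATH) and return the output file path."""
    command = build_extract_command(source, audio_format)
    subprocess.run(command, check=True)
    return command[-1]
```

Calling `extract_audio("clip.mp4", "flac")` would, under these assumptions, write `clip.flac` next to the source file. Looping over a list of files with the same function gives a simple form of batch processing.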

Common Problems

  • Browser does not open automatically: Close the Windows command window and re-run start.bat, or manually enter the address shown in the command window into your browser (e.g. http://127.0.0.1:7892).
  • CUDA out-of-memory error: Check GPU memory usage and adjust the noise reduction level or calculation type.
  • Windows Defender warning: Add the batch file as an exception or temporarily disable Windows Defender.
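When the browser does not open, it is useful to know whether the WebUI is actually listening before restarting anything. The sketch below tries a plain TCP connection to the address from the example above. The helper name is hypothetical, and the port 7892 is only the example value from this article; use whatever address the command window displays.

```python
import socket
from urllib.parse import urlsplit


def webui_is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        return False
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    address = "http://127.0.0.1:7892"  # example address; yours may differ
    state = "reachable" if webui_is_reachable(address) else "not reachable"
    print(f"{address} is {state}")
```

If the address is reachable, opening it manually in the browser should work; if not, the program is probably still installing or failed to start.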
May not be reproduced without permission: Chief AI Sharing Circle » Voice-Pro: open source multifunctional video translation tool, voice transcription and translation into multiple languages, Windows one-click installation