AI Personal Learning
and practical guidance

Linly-Dubbing: Intelligent Video Multilingual AI Dubbing/Translation Tool

General Introduction

Linly-Dubbing is an intelligent multilingual AI dubbing and translation tool designed to provide users with high-quality multilingual video dubbing and subtitle translation services by integrating advanced AI technology. The tool is particularly suitable for scenarios such as international education, global content localization, etc., helping teams spread high-quality content across the globe.

Linly-Dubbing: Intelligent Video Multilingual AI Dubbing/Translation Tool


 

Function List

  • Multi-language support: Provides dubbing and subtitling translations in Chinese and many other languages to meet globalization needs.
  • AI speech recognition: Speech-to-text conversion and speaker recognition using advanced AI technology.
  • Large Language Model Translation: Combined with leading-edge language modeling (e.g., GPT), translations are performed quickly and accurately, ensuring professionalism and naturalness.
  • AI voice cloning: Utilizes cutting-edge voice cloning technology to generate a voice that is highly similar to the original video dub, maintaining emotional and intonational coherence.
  • Digital human-to-human lip-synching technology: Through lip-synching technology, the voice-over is highly compatible with the video screen, enhancing the sense of realism and interactivity.
  • Flexible Uploading and Translation: Users can upload videos and choose their own translation language and standard, ensuring personalization and flexibility.
  • regular update: Continuously introducing the latest models to stay at the forefront of dubbing and translation.

 

Using Help

Installation process

  1. clone warehouse: First, clone the Linly-Dubbing repository to your local machine and initialize the submodules.
    git clone https://github.com/Kedreamix/Linly-Dubbing.git --depth 1
    cd Linly-Dubbing
    git submodule update --init --recursive
    
  2. Installation of dependencies: Create a new Python environment and install the required dependencies.
    conda create -n linly_dubbing python=3.10 -y
    conda activate linly_dubbing
    cd Linly-Dubbing/
    conda install ffmpeg==7.0.2 -c conda-forge
    python -m pip install --upgrade pip
    pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
    pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu118
    pip install -r requirements.txt
    pip install -r requirements_module.txt
    
  3. Configuring Environment Variables: Create the .env file in the project root directory and fill in the necessary environment variables.
    OPENAI_API_KEY=sk-xxx
    MODEL_NAME=gpt-4
    HF_TOKEN=your_hugging_face_token
    
  4. Running the application: Download the required model and launch the WebUI interface.
    bash scripts/download_models.sh
    python webui.py
    

Usage Process

  1. Upload Video: Users can upload video files to be dubbed or translated through the WebUI interface.
  2. Selection of language and criteria: After uploading the video, the user can select the language to be translated and the dubbing standard.
  3. Generate voiceovers and subtitles: The system will automatically perform speech recognition, translation and dubbing generation, and synchronize the generation of subtitle files.
  4. Download results: Users can download the generated dubbed video and subtitle files for subsequent editing and use.

Main Functions

  • Automatic video download: Use the yt-dlp utility to download video and audio in a variety of formats and resolution options.
  • vocal separation: Vocal and backing track separation using Demucs and UVR5 technology to produce high quality backing track and vocal extracts.
  • AI speech recognition: Accurate speech recognition and subtitle generation using WhisperX and FunASR, with support for multi-speaker recognition.
  • Large Language Model Translation: High-quality multilingual translations combining the OpenAI API and the Qwen model.
  • AI speech synthesis: Generate natural, smooth voice output with Edge TTS and CosyVoice, supporting multiple languages and voice styles.
  • Video Processing: Provide functions such as adding subtitles, inserting background music, adjusting volume and modifying playback speed to personalize the video content.
  • Digital human-to-human lip-synching technology: Digital human-to-digital lip-synchronization through Linly-Talker technology to enhance the professionalism of the video and the viewing experience.
May not be reproduced without permission:Chief AI Sharing Circle " Linly-Dubbing: Intelligent Video Multilingual AI Dubbing/Translation Tool

Chief AI Sharing Circle

Chief AI Sharing Circle specializes in AI learning, providing comprehensive AI learning content, AI tools and hands-on guidance. Our goal is to help users master AI technology and explore the unlimited potential of AI together through high-quality content and practical experience sharing. Whether you are an AI beginner or a senior expert, this is the ideal place for you to gain knowledge, improve your skills and realize innovation.

Contact Us
en_USEnglish