VideoLingo: an open source tool for video transcription with word-level subtitle timestamps, subtitle translation, and localized dubbing

Latest AI Resources | updated 10 months ago | AI Sharing Circle

General Introduction

VideoLingo is an all-in-one video translation and localized dubbing tool. It aims to produce Netflix-quality subtitles, avoiding stiff machine translation and multi-line subtitles, and adds high-quality dubbing so that knowledge can be shared across language barriers. Through an intuitive Streamlit web interface, users can go from a video link to a localized video with embedded high-quality bilingual subtitles, and optionally dubbing, in just two clicks.

For a more integrated tool, though one whose transcription process is less detailed than VideoLingo's, see pyvideotrans: Video Translation Dubbing Tool.

Function List

Download videos from YouTube links with yt-dlp
Word-level subtitle recognition with timestamps using WhisperX
Subtitle segmentation by sentence meaning using NLP and GPT
GPT summarizes the content and extracts a terminology knowledge base, so translations stay consistent in context
Three-step translation (direct translation, reflection, free translation), aiming for quality comparable to a subtitle team's polished work
Single-line length checks following Netflix standards, so no two-line subtitles are produced
High-quality aligned dubbing using methods such as GPT-SoVITS
One-click launch from the integration package and one-click video output in Streamlit
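
The three-step translation in the list above (direct translation, then reflection, then a freer rewrite) can be illustrated with a minimal sketch. This is not VideoLingo's actual code: the function name `three_step_translate`, the prompt wording, and the `llm` callable (any function that takes a prompt string and returns the model's reply) are all assumptions made for illustration.

```python
from typing import Callable, Dict


def three_step_translate(
    source: str,
    target_lang: str,
    llm: Callable[[str], str],  # hypothetical: wraps whichever GPT-style API you use
) -> Dict[str, str]:
    """Run direct translation, reflection, then free translation on one line."""
    # Step 1: a literal, faithful draft.
    direct = llm(
        f"Translate the following subtitle line into {target_lang} "
        f"as literally as possible:\n{source}"
    )
    # Step 2: ask the model to critique the draft.
    reflection = llm(
        f"Source: {source}\nDraft translation: {direct}\n"
        "List any problems with fluency, terminology, or tone in the draft."
    )
    # Step 3: rewrite freely, using the critique from step 2.
    free = llm(
        f"Source: {source}\nDraft translation: {direct}\n"
        f"Review notes: {reflection}\n"
        f"Rewrite the translation in natural {target_lang}, fixing the issues above."
    )
    return {"direct": direct, "reflection": reflection, "free": free}
```

Passing the model in as a plain callable keeps the sketch independent of any particular API client, and makes it easy to substitute a stub when testing the flow.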

Usage Guide

Installation process

Download the VideoLingo one-click integration package (about 800 MB).
Unzip it and double-click "One-Click-Boot.bat" in the extracted folder.
In the browser window that opens, configure the required settings in the sidebar, then generate the video with one click.

Functional operation flow

Video download: Enter a YouTube video link in the Streamlit interface; the video is downloaded with yt-dlp.
Subtitle recognition: WhisperX performs word-level recognition with timestamps so that subtitles align accurately with the video content.
Subtitle segmentation: NLP and GPT split the subtitles by sentence meaning and produce single-line subtitles that meet Netflix standards.
Translation: GPT summarizes the content and extracts a terminology knowledge base, then translates in context so that the result reads naturally.
Dubbing: High-quality aligned dubbing, using methods such as GPT-SoVITS, produces a voice track that closely matches the original video content.
One-click video output: Once everything is configured in the Streamlit interface, a single click generates a video with high-quality bilingual subtitles and dubbing.
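
To show how word-level timestamps make single-line subtitles possible, here is a minimal sketch that packs timed words into cues under a character limit. It is a simplification under stated assumptions: it splits by length only, whereas VideoLingo segments by sentence meaning using NLP and GPT. The `Word` and `Cue` types, the function names, and the 42-character default are assumptions for illustration; check the Netflix style guide for the limit that applies to your target language.

```python
from dataclasses import dataclass
from typing import List

MAX_CHARS = 42  # assumed per-line limit; varies by language


@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds


@dataclass
class Cue:
    text: str
    start: float
    end: float


def _to_cue(words: List[Word]) -> Cue:
    # A cue spans from the first word's start to the last word's end.
    return Cue(" ".join(w.text for w in words), words[0].start, words[-1].end)


def split_into_cues(words: List[Word], max_chars: int = MAX_CHARS) -> List[Cue]:
    """Greedily pack words into single-line cues of at most max_chars."""
    cues: List[Cue] = []
    current: List[Word] = []
    for word in words:
        candidate = " ".join(w.text for w in current + [word])
        # Close the current cue if adding this word would exceed the limit.
        if current and len(candidate) > max_chars:
            cues.append(_to_cue(current))
            current = []
        # Note: a single word longer than max_chars still gets its own cue.
        current.append(word)
    if current:
        cues.append(_to_cue(current))
    return cues
```

Because every word carries its own start and end time, each cue's timing falls out of the split directly, with no need to re-align text against the audio.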