
AI no jimaku gumi: Automatic generation and translation of multilingual subtitles for videos with the help of AI

General Introduction

AI no jimaku gumi (roughly, "AI's subtitle group") is a command-line tool that automates video subtitle extraction, transcription, and translation. It combines the Whisper speech recognition model with several translation backends (DeepL, LLM, and others) to process video and audio content and generate subtitle files. It translates between mainstream languages, including English, Japanese, Chinese, and Korean, and provides flexible subtitle output options. The project is open source with complete source code available, and it currently runs on Linux and macOS, with Windows support planned.

 

Function List

  • Automatically extracts audio from video and recognizes speech
  • Supports multiple subtitle sources: audio recognition, container extraction, OCR recognition
  • Integrates with multiple translation backends: DeepL, LLM, etc.
  • Translates from and to many mainstream languages
  • Offers a configurable subtitle output format (SRT is currently supported)
  • Can process a selected clip of a video instead of the whole file
  • Provides debugging modes: audio extraction only, transcription only, translation only, and other options
  • Supports custom AI model paths and configurations
  • Runs cross-platform (Linux and macOS; Windows support is planned)

 

Usage Guide

1. Environment preparation

Windows support is in preparation.

 

Installing dependencies on Linux:

  • Ubuntu users:
apt-get install -y clang cmake make pkg-config libavcodec-dev libavdevice-dev libavfilter-dev libavformat-dev libavutil-dev libpostproc-dev libswresample-dev libswscale-dev
  • Fedora users:
dnf install clang cmake ffmpeg-free-devel make pkgconf-pkg-config
  • Arch Linux users:
pacman -S clang cmake ffmpeg make pkgconf

Installing dependencies on macOS:

Use the Homebrew package manager:

brew install cmake ffmpeg

2. Installation steps

  1. Clone the code repository:
git clone https://github.com/Inokinoki/ai-no-jimaku-gumi.git
cd ai-no-jimaku-gumi
  2. Compile the project:
cargo build
  3. Download the Whisper model:
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-tiny.bin
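
Before running the steps above, it can help to confirm that the required command-line tools are on the PATH. The following is a minimal sketch in POSIX shell; the helper name missing_tools is hypothetical and not part of the project. Note that cargo build needs a Rust toolchain, which the dependency lists above do not install.

```shell
# Print the name of every listed command that is not found on PATH.
# missing_tools is a hypothetical helper, not part of ai-no-jimaku-gumi.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Tools used by the installation steps; cargo comes from a Rust toolchain.
missing=$(missing_tools git cargo cmake clang wget)
if [ -n "$missing" ]; then
  # $missing is left unquoted on purpose so the names print on one line.
  echo "Missing tools:" $missing >&2
fi
```

If the script prints nothing, all of the listed tools were found.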

3. Basic use

The tool offers several configuration options:

  • --input-video-path: Specify the input video file path (required)
  • --source-language: Source language (default: ja)
  • --target-language: Target language (default: en)
  • --ggml-model-path: AI model path (default: ggml-tiny.bin)
  • --subtitle-output-path: Subtitle output path (default: output.srt)
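
The options above can be combined into a single invocation. The sketch below wraps that call in a small POSIX shell function; run_aino, AINO_BIN, and DRY_RUN are hypothetical names introduced only for illustration, and the binary path assumes the debug build produced by cargo build. Setting DRY_RUN prints the command instead of running it, which is useful for checking the flags first.

```shell
# Path to the binary; assumes the debug build from `cargo build`.
AINO_BIN="${AINO_BIN:-./target/debug/ainojimakugumi}"

# run_aino VIDEO [SOURCE_LANG] [TARGET_LANG]
# Hypothetical wrapper: spells out the documented defaults explicitly.
run_aino() {
  aino_video="$1"
  aino_src="${2:-ja}"
  aino_dst="${3:-en}"
  # Rebuild the positional parameters as the argument list for the tool.
  set -- --input-video-path "$aino_video" \
         --source-language "$aino_src" \
         --target-language "$aino_dst" \
         --ggml-model-path ggml-tiny.bin \
         --subtitle-output-path output.srt
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$AINO_BIN $*"      # show the command without running it
  else
    "$AINO_BIN" "$@"
  fi
}
```

For example, after setting DRY_RUN=1, calling run_aino video.webm prints the full command line with the default Japanese-to-English settings.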

4. Translation backend configuration

DeepL translation backend (default):

  1. Set the environment variables:
export DEEPL_API_KEY="your-api-key"
export DEEPL_API_URL=https://api.deepl.com # Required for paid API version
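
Because transcription can take a while, it may be worth checking that the API key is present before starting a run. This is a minimal sketch in POSIX shell; require_env is a hypothetical helper, and how the tool itself reports a missing key is not covered here.

```shell
# Return 0 if the named environment variable is set and non-empty,
# otherwise print an error and return 1.
# require_env is a hypothetical helper; pass only trusted variable names,
# since the name is expanded with eval.
require_env() {
  eval "value=\${$1:-}"
  if [ -z "$value" ]; then
    echo "error: $1 is not set" >&2
    return 1
  fi
}

# DeepL is the default backend, so check its key first.
require_env DEEPL_API_KEY || echo "export DEEPL_API_KEY before running" >&2
```

The same check works for the LLM backend by passing CUSTOM_API_KEY instead.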

LLM translation backend:

  1. Set the environment variable:
export CUSTOM_API_KEY=sk-xxxxxxxxxxxxxxxxxxxxxxxxx
  2. Example of use:
./target/debug/ainojimakugumi --input-video-path video.webm \
--translator-backend llm \
--llm-api-base https://your-api-endpoint.com/v1/ \
--llm-prompt 'translate this to English' \
--llm-model-name 'gpt-4o-mini' \
--ggml-model-path ggml-small.bin

5. Advanced functions

  • Use --start-time and --end-time to process a specific video clip
  • --only-extract-audio: Extract audio only (for debugging)
  • --only-transcript: Generate subtitles in the original language only
  • --only-translate: Run the translation step only
  • Supports selecting the subtitle source: audio (default), container, ocr
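
Whichever mode is used, the result is an SRT file, so a quick sanity check is to count its cues. The sketch below does this in POSIX shell by matching SRT timing lines; count_srt_cues is a hypothetical helper, and output.srt is the default output path mentioned earlier.

```shell
# Count subtitle cues in an SRT file by counting its timing lines,
# which look like: 00:00:01,000 --> 00:00:03,500
# count_srt_cues is a hypothetical helper, not part of the tool.
count_srt_cues() {
  # grep -c prints 0 but exits non-zero when nothing matches; ignore that.
  grep -c -E '^[0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3} --> [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}' "$1" || true
}

# Report the cue count if the default output file exists.
if [ -f output.srt ]; then
  echo "cues in output.srt: $(count_srt_cues output.srt)"
fi
```

A count of 0 after a run suggests that no speech was recognized or that the file is not valid SRT.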

