General Introduction
AI2SRT is an open source project that uses the Gemini large model to turn long videos into short narrated videos and video summaries with one click. It also transcribes audio and video files into subtitles and translates subtitle files. The project aims to simplify video content creation and to make subtitle generation and translation efficient. With a few simple operations, users can convert a long video into a short one and generate the matching subtitle file, which suits scenarios such as education, entertainment and business promotion.
The tool is operated through a web interface, is simple and intuitive to use, and runs on Windows, Linux and Mac. It relies on Gemini to understand video content, write narration copy, and translate subtitles using the three-step reflection method, which makes it a useful assistant for video creators and content editors. It is intended as a companion tool for pyVideoTrans.
Function List
- One-click creation of short AI-narrated videos from long videos
- Automatic generation of video content summary reports
- Translation of SRT subtitle files using the three-step reflection method
- Automatic transcription of audio and video files to SRT subtitles
- Web interface, usable across platforms
- Custom AI prompts to tune the output
- Integrates the Gemini API and supports gemini-1.5-flash and other models
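For readers unfamiliar with the format, an SRT file is a plain-text sequence of numbered cues, each with a time range written as HH:MM:SS,mmm --> HH:MM:SS,mmm followed by the cue text and a blank line. The sketch below shows how transcription segments could be rendered in that format. The (start, end, text) tuple layout and both function names are assumptions made for illustration and are not taken from the ai2srt source.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Render (start_seconds, end_seconds, text) tuples as SRT text.

    The tuple layout is a hypothetical input shape chosen for this sketch.
    """
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{time_range}\n{text.strip()}\n")
    # Cues are separated by one blank line.
    return "\n".join(blocks)
```

Calling segments_to_srt([(0, 1.5, "Hello"), (1.5, 3, "World")]) yields two numbered cues that any SRT-aware player or editor should accept.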
Usage Guide
1. Environment preparation
Before you start using ai2srt, make sure you have the following:
- A stable network proxy (required)
- A Gemini API key (free to apply for)
- The package or setup that matches your operating system
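The prerequisites above can be checked with a short script before installing anything. This is a minimal sketch under stated assumptions: GEMINI_API_KEY is a hypothetical variable name (ai2srt may instead ask for the key in its web interface), and HTTPS_PROXY / HTTP_PROXY are the conventional proxy variables, which may not be where ai2srt reads its proxy setting.

```python
import os
import platform


def check_environment(env=None, system=None):
    """Return a list of problems found; an empty list means every check passed."""
    env = os.environ if env is None else env
    system = platform.system() if system is None else system
    problems = []

    # The three platforms the tool is stated to support ("Darwin" is macOS).
    if system not in ("Windows", "Linux", "Darwin"):
        problems.append(f"unsupported operating system: {system}")

    # GEMINI_API_KEY is a hypothetical variable name used only in this sketch.
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")

    # Conventional proxy variables; ai2srt may expect the proxy elsewhere.
    proxy_vars = ("HTTPS_PROXY", "https_proxy", "HTTP_PROXY", "http_proxy")
    if not any(env.get(name) for name in proxy_vars):
        problems.append("no proxy found in HTTPS_PROXY or HTTP_PROXY")

    return problems
```

Printing each entry of check_environment() gives a quick checklist of what is still missing. The env and system parameters exist only so the function can be exercised without touching the real environment.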
2. Installation and deployment
Quick deployment for Windows users:
- Download the latest pre-packaged version from the GitHub Releases page
- Extract the downloaded archive (e.g. window-gemini-video-tools-0.3.7z)
- Double-click the "startup.bat" file to run the program.
- The program will automatically open the web interface in your browser: http://127.0.0.1:5030
Deployment steps for Linux/Mac users:
- Clone the code repository:
git clone https://github.com/jianchang512/ai2srt
- Go to the project directory:
cd ai2srt
- Create and activate a virtual environment:
python3 -m venv venv
source ./venv/bin/activate
- Install the dependency packages:
pip3 install -r requirements.txt
- Launch the application:
python3 app.py
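After launching, it can be useful to confirm that the web interface is actually answering before opening the browser. The sketch below polls the address until it responds. It assumes the source install serves the same address as the Windows package (http://127.0.0.1:5030), which the text above states only for Windows; the function name and its parameters are illustrative.

```python
import time
import urllib.error
import urllib.request


def wait_for_ui(url="http://127.0.0.1:5030", timeout=30.0, interval=1.0):
    """Poll `url` until it answers an HTTP request or `timeout` seconds pass."""
    # An empty ProxyHandler bypasses any system proxy, so that a request to
    # the local machine is not routed through the network proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener.open(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server replied, even if with an error status, so it is up.
            return True
        except OSError:
            # Covers URLError, refused connections and timeouts.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A False result after the timeout suggests the application did not start, or that it listens on a different port than assumed here.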
3. Feature usage
3.1 Creating Video Narration
- Upload the long video file to be processed in the web interface
- Select the "Create Narration Video" function.
- Configure the Gemini parameters and prompts (optional)
- Click Start Processing and wait for the AI to analyze the video and generate the narration
- The system will automatically synthesize a new video with narration
3.2 Subtitle transcription and translation
- Upload the audio or video file to be processed
- Select the "Subtitle Transcription" or "Subtitle Translation" function.
- For translation, the three-step reflection method can be used to improve translation quality
- Wait for processing to complete and then download the resulting SRT subtitle file.
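The three-step reflection method is commonly described as: produce a draft translation, have the model critique that draft, then rewrite the translation using the critique. The sketch below follows that general pattern. The prompt wording is illustrative and is not taken from ai2srt, and `ask` is a hypothetical callable standing in for whatever function sends a prompt to the model and returns its reply.

```python
from typing import Callable


def reflective_translate(text: str, target_lang: str, ask: Callable[[str], str]) -> str:
    """Translate `text` in three passes: draft, critique, refine.

    `ask` is a hypothetical callable: prompt in, model reply out.
    """
    # Step 1: draft translation.
    draft = ask(
        f"Translate the following subtitle text into {target_lang}. "
        f"Return only the translation.\n\n{text}"
    )

    # Step 2: reflection. The model reviews its own draft against the source.
    critique = ask(
        f"Source text:\n{text}\n\nDraft translation into {target_lang}:\n{draft}\n\n"
        "List concrete problems with accuracy, fluency and style."
    )

    # Step 3: refinement. The draft is rewritten using the critique.
    final = ask(
        f"Source text:\n{text}\n\nDraft translation:\n{draft}\n\n"
        f"Review notes:\n{critique}\n\n"
        f"Rewrite the translation into {target_lang}, applying the notes. "
        "Return only the improved translation."
    )
    return final.strip()
```

Passing the model call in as a parameter keeps the three steps independent of any particular client library. The trade-off of the method is three model calls per translated text instead of one.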
3.3 Video Summary Generation
- Upload the video file
- Select the "Video Summary" function
- Wait for the AI to analyze the video content and generate the summary report
4. Cautions
- Keep the network proxy stable during use; the tool cannot work properly without it
- The gemini-1.5-flash model is recommended because it has a generous free usage quota
- AI output can be improved by adjusting the prompts
- If processing fails, check the network proxy status first
- Keep the program up to date to get the latest features and optimizations
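Because the proxy is the first thing to check when processing fails, a quick reachability test can save time. The sketch below opens a TCP connection to the proxy address, read by default from the conventional HTTPS_PROXY or HTTP_PROXY variables (an assumption; ai2srt may take its proxy setting from somewhere else). It only confirms that something is listening at that address, not that the proxy can reach the Gemini service.

```python
import os
import socket
from urllib.parse import urlsplit


def proxy_is_listening(proxy_url=None, timeout=3.0):
    """Return True if a TCP connection to the proxy host and port succeeds."""
    proxy_url = (
        proxy_url
        or os.environ.get("HTTPS_PROXY")
        or os.environ.get("HTTP_PROXY")
    )
    if not proxy_url:
        return False
    # Accept bare "host:port" values by adding a scheme for the URL parser.
    if "://" not in proxy_url:
        proxy_url = "http://" + proxy_url
    try:
        parts = urlsplit(proxy_url)
        host, port = parts.hostname, parts.port
    except ValueError:
        # Raised for a malformed URL or an out-of-range port.
        return False
    # This sketch expects an explicit port rather than guessing a default.
    if not host or port is None:
        return False
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, proxy_is_listening("http://127.0.0.1:7890") tests a local proxy, where the port 7890 is only an example value.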