
BiliNote: The AI tool that automatically generates Markdown notes from videos

General Introduction

BiliNote is an open-source AI video note-taking tool that extracts content from BiliBili and YouTube video links and automatically generates clearly structured notes in Markdown format. It uses local audio transcription and a variety of large models (such as OpenAI, DeepSeek, and Qwen) for content summarization, and supports inserting video screenshots and timestamp jump links. The project is hosted on GitHub under the MIT license and is available as a Docker deployment and a Windows packaged version, helping students, creators, and researchers organize materials for study or work. The official online demo is deployed on Cloudflare Pages and may be slow to access depending on network conditions.



 

Feature List

  • Automatically extracts content from Bilibili and YouTube video links to generate Markdown notes.
  • Transcribes audio locally using Fast-Whisper models, protecting data privacy.
  • Supports OpenAI, DeepSeek, Qwen, and other large models for summarizing the core content of a video.
  • Optional insertion of video keyframe screenshots to enhance note visualization.
  • Generate timestamped notes with support for jumping to the corresponding point in time of the original video.
  • Provides a task history, so you can review previously generated notes.
  • Supports Docker one-click deployment to simplify local or cloud installations.
  • A packaged version (exe file) is available for Windows and does not require complex configuration to use.
  • Plans to support more video platforms such as Douyin and Kuaishou.
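The timestamp jump links mentioned above can be illustrated with a short sketch. This is a hypothetical helper, not BiliNote's actual code; the `?t=`/`&t=` query parameters are the public conventions Bilibili and YouTube use for seeking to a point in a video.

```python
# Hypothetical sketch of timestamp jump links (not BiliNote's actual code).
def jump_link(video_url: str, seconds: int) -> str:
    """Append a seek parameter so the link opens at the given time."""
    sep = "&" if "?" in video_url else "?"
    if "youtube.com" in video_url or "youtu.be" in video_url:
        return f"{video_url}{sep}t={seconds}s"   # YouTube uses e.g. t=90s
    return f"{video_url}{sep}t={seconds}"        # Bilibili uses t=90 (seconds)

def md_timestamp(video_url: str, seconds: int) -> str:
    """Render a [mm:ss](link) Markdown timestamp."""
    mm, ss = divmod(seconds, 60)
    return f"[{mm:02d}:{ss:02d}]({jump_link(video_url, seconds)})"
```

Clicking such a link in the rendered notes opens the original video at that moment.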

 

Usage Guide

Installation and Deployment

BiliNote offers three ways to use it: manual deployment, Docker deployment and Windows packaged version. Below are the detailed steps:

Manual Deployment

  1. Cloning Project Code
    Run the following command to get the source code:

    git clone https://github.com/JefferyHcool/BiliNote.git
    cd BiliNote
    mv .env.example .env
    
  2. Install FFmpeg
    BiliNote relies on FFmpeg for audio processing and must be installed:

    • Mac: Run brew install ffmpeg
    • Ubuntu/Debian: Run sudo apt install ffmpeg
    • Windows: Download and install FFmpeg from the official FFmpeg website, and make sure the path to the FFmpeg executable is added to the system PATH environment variable.
  3. Configuring the backend
    Go to the backend directory, install the dependencies and start the service:

    cd backend
    pip install -r requirements.txt
    python main.py
    

    Edit the .env file to configure the API keys and ports, for example:

    API_BASE_URL=http://localhost:8000
    OUT_DIR=note_results
    IMAGE_BASE_URL=/static/screenshots
    MODEL_PROVIDER=openai
    OPENAI_API_KEY=sk-xxxxxx
    DEEP_SEEK_API_KEY=xxx
    QWEN_API_KEY=xxx
    
  4. Configuring the Front End
    Go to the front-end directory, install the dependencies and start the service:

    cd BiliNote_frontend
    pnpm install
    pnpm dev
    

    Visit http://localhost:5173 to view the front-end interface.

  5. Optimized audio transcription (optional)
    If you are using an NVIDIA GPU, you can enable the CUDA-accelerated version of Fast-Whisper, see Fast-Whisper Project Configuration.
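The `.env` keys shown in step 3 follow the common KEY=VALUE dotenv format. As an illustration, such a file can be parsed as below; this is a minimal sketch, not the loader BiliNote actually uses (real deployments typically rely on a library such as python-dotenv).

```python
# Minimal sketch of parsing a KEY=VALUE .env file (illustration only;
# BiliNote itself may use a dedicated loader such as python-dotenv).
def parse_env(text: str) -> dict:
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

sample = """API_BASE_URL=http://localhost:8000
MODEL_PROVIDER=openai
# comment line
OPENAI_API_KEY=sk-xxxxxx"""
```

Calling `parse_env(sample)` returns a plain dictionary of the configured settings.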

Docker Deployment

  1. Ensure that Docker and Docker Compose are installed
    Refer to the Docker website for installation instructions.
  2. Clone and configure the project
    git clone https://github.com/JefferyHcool/BiliNote.git
    cd BiliNote
    mv .env.example .env
    
  3. Starting services
    Run the following command to build and start the container:

    docker compose up --build
    

    By default, the front end is served at http://localhost:${FRONTEND_PORT} and the back end at http://localhost:${BACKEND_PORT}; both ports can be customized in the .env file.

Windows Packaged Version

  1. Download exe file
    Visit the GitHub Release page to download the Windows package (exe file).
  2. Run the program
    Double-click the exe file to start; there is no need to install FFmpeg or configure environment variables manually. The first time you run it, you need to enter an API key.
  3. Configuring API Keys
    Enter the API key for OpenAI, DeepSeek, or Qwen in the program interface, save it, and start using it.

Usage Procedure

  1. Visit BiliNote
    • Local deployment: open a browser and visit http://localhost:5173.
    • Online experience: visit https://www.bilinote.app (may load slowly due to Cloudflare Pages).
    • Windows packaged version: Double click on the exe file to start the program.
  2. Enter video link
    Enter a link to a publicly available Bilibili or YouTube video in the interface, for example https://www.bilibili.com/video/xxx, then click "Submit" to begin processing.
  3. Configuration Generation Options
    • AI model: Choose OpenAI, DeepSeek, or Qwen for content summarization.
    • Screenshot insertion: Choose whether to automatically insert video screenshots.
    • Jump links: Choose whether to generate timestamped jump links.
    • Note style: Choose from Academic style, Spoken style, or Focused Extraction mode (some styles await future updates).
  4. Generate notes
    After clicking "Generate", BiliNote downloads the video's audio, transcribes it to text using Fast-Whisper, and generates Markdown notes using the selected large model. Generation time depends on the video's length and your hardware performance.
  5. Viewing and exporting notes
    • Notes are displayed in Markdown format, with headings, paragraphs, timestamps, and screenshots (if enabled).
    • Click on the timestamp to jump to the corresponding point in time of the original video.
    • Export to Markdown files is supported, with future plans to support PDF, Word and Notion formats.
    • Historical notes can be viewed on the Task History screen, with support for viewing and editing.
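The shape of the notes described above (headings, timestamps, optional screenshots) can be sketched roughly as follows. The segment structure and helper name are illustrative assumptions, not BiliNote's real output format.

```python
# Illustrative sketch of assembling a Markdown note from summarized
# video segments (hypothetical structure, not BiliNote's actual format).
def build_note(title: str, segments: list, with_screenshots: bool = False) -> str:
    lines = [f"# {title}", ""]
    for seg in segments:
        mm, ss = divmod(seg["seconds"], 60)
        # Section heading doubles as a clickable timestamp jump link.
        lines.append(f"## [{mm:02d}:{ss:02d}]({seg['link']}) {seg['heading']}")
        if with_screenshots and seg.get("screenshot"):
            lines.append(f"![keyframe]({seg['screenshot']})")
        lines.append(seg["summary"])
        lines.append("")
    return "\n".join(lines)
```

Each segment renders as a timestamped section, with an optional keyframe image when screenshot insertion is enabled.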

Feature Highlights

  • Native Audio Transcription: Fast-Whisper models run locally to protect data privacy. Supports CUDA acceleration for faster transcription.
  • Multi-model support: Switch between OpenAI, DeepSeek, or Qwen for different languages and scenarios (e.g., Qwen is better for Chinese videos).
  • Screenshot Insertion: Automatically intercepts video keyframes and inserts them into the corresponding positions of the notes to enhance readability.
  • Task history: Each generation task is automatically saved for later review or modification.
  • Windows packaged version: Provides an out-of-the-box experience for non-technical users and simplifies installation.
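For the local transcription feature, device selection for faster-whisper might look like the sketch below. `WhisperModel(model_size, device=..., compute_type=...)` is the real constructor exposed by the faster-whisper package, but the helper and its defaults are assumptions for illustration.

```python
# Hypothetical device-selection helper for faster-whisper (illustration).
def pick_device(has_cuda: bool) -> tuple:
    """Choose (device, compute_type): float16 on an NVIDIA GPU, int8 on CPU."""
    return ("cuda", "float16") if has_cuda else ("cpu", "int8")

def load_model(model_size: str = "base", has_cuda: bool = False):
    # Deferred import so the sketch runs even without faster-whisper installed.
    from faster_whisper import WhisperModel
    device, compute_type = pick_device(has_cuda)
    return WhisperModel(model_size, device=device, compute_type=compute_type)
```

On CUDA-capable machines, float16 inference is typically much faster than CPU int8 transcription.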

Caveats

  • Video links need to be publicly accessible, private videos may not be processed.
  • The content summarization feature needs to be configured with a valid API key (OpenAI, DeepSeek, or Qwen).
  • FFmpeg must be installed correctly (except for Windows packages).
  • The online experience may load slowly due to Cloudflare Pages limitations, so we recommend deploying locally or using the Windows packaged version.
  • Ensure a stable network to avoid audio downloads or API calls failing.
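Since only public Bilibili and YouTube links are processed, a front end could pre-validate submitted links with a pattern check like this hypothetical sketch (illustration only; BiliNote's actual validation may differ):

```python
import re

# Hypothetical link pre-check matching common Bilibili/YouTube URL shapes.
PATTERNS = {
    "bilibili": re.compile(r"https?://www\.bilibili\.com/video/\w+"),
    "youtube": re.compile(r"https?://(www\.youtube\.com/watch\?v=[\w-]+|youtu\.be/[\w-]+)"),
}

def detect_platform(url: str):
    """Return the platform name for a supported link, or None."""
    for name, pattern in PATTERNS.items():
        if pattern.match(url):
            return name
    return None
```

Unsupported or malformed links can then be rejected before any download is attempted.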

 

Application Scenarios

  1. Students organize their online class notes
    Students can generate Markdown notes from Bilibili or YouTube videos, extracting key points and timestamps for easy review and navigation.
  2. Content creators organize their material
    Creators can extract video scripts or key information to generate notes with screenshots for content curation or copywriting.
  3. Archiving of corporate training content
    Enterprises can turn training videos into structured notes for employees to review or archive, improving learning efficiency.
  4. Researchers organize academic lectures
    Researchers can turn academic conference videos into notes, extract core ideas and data, and build a knowledge base.
  5. Personal knowledge management
    Users can turn videos of interest (e.g., tutorials, podcasts) into notes and save them to their personal knowledge base for access at any time.

 

QA

  1. What video platforms does BiliNote support?
    Currently it supports BiliBili and YouTube; support for Douyin, Kuaishou, and other platforms is planned.
  2. What is the difference between a packaged version of Windows and a local deployment?
    The Windows packaged version eliminates the need to manually install FFmpeg or configure the environment for non-technical users. Local deployment is more flexible, with support for custom configurations and GPU acceleration.
  3. How can I increase the speed of audio transcription?
    If you have an NVIDIA GPU, enable the CUDA-accelerated version of Fast-Whisper; refer to the Fast-Whisper project for configuration.
  4. Do I have to use a paid API key?
    The content summarization feature requires an API key for OpenAI, DeepSeek, or Qwen (fees may apply). Audio transcription can be run locally for free.
  5. Why is the online experience version loading slowly?
    The online version is deployed on Cloudflare Pages and is subject to network and server limitations. Local deployment or Windows packaged version is recommended.
May not be reproduced without permission: Chief AI Sharing Circle » BiliNote: The AI tool that automatically generates Markdown notes from videos