Paper2Video - NUS open-source project that automatically generates presentation videos from academic papers

Latest AI Resources | Released 5 months ago | AI Sharing Circle


What is Paper2Video

Paper2Video is an open-source project from Show Lab at the National University of Singapore (NUS) that automatically generates presentation videos from academic papers. Using the PaperTalker multi-agent framework, it turns a paper into a complete presentation video with slides, subtitles, voiceover, and a speaker avatar. The framework consists of four modules: the slide builder, subtitle builder, cursor builder, and speaker builder, which handle slide generation, subtitle generation, cursor positioning, and speaker video generation, respectively. Paper2Video also provides what the project describes as the first high-quality benchmark for academic presentation videos, consisting of 101 papers paired with their authors' presentation videos, slides, and related materials.
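The four-module division described above can be pictured as a simple pipeline. The following is only a minimal sketch under stated assumptions: every class and function name here (`SlideAssets`, `build_slides`, `build_subtitles`, `build_cursor`, `build_speaker`, `paper_to_video_plan`) is hypothetical and is not taken from the Paper2Video codebase, and each stage is a trivial stand-in for what would be a model-driven step in the real project.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SlideAssets:
    """Everything produced for one slide. All fields are illustrative."""
    slide_text: str
    subtitles: List[str] = field(default_factory=list)
    cursor_points: List[Tuple[float, float]] = field(default_factory=list)
    speaker_clip: str = ""


def build_slides(paper_sections: List[str]) -> List[SlideAssets]:
    # Stand-in for the slide builder: one slide per non-empty paper section.
    return [SlideAssets(slide_text=s.strip()) for s in paper_sections if s.strip()]


def build_subtitles(slide: SlideAssets) -> None:
    # Stand-in for the subtitle builder: one subtitle per sentence.
    parts = slide.slide_text.split(".")
    slide.subtitles = [p.strip() + "." for p in parts if p.strip()]


def build_cursor(slide: SlideAssets) -> None:
    # Stand-in for the cursor builder: one normalized (x, y) point per
    # subtitle, spaced evenly down the middle of the slide.
    n = len(slide.subtitles)
    slide.cursor_points = [(0.5, (i + 1) / (n + 1)) for i in range(n)]


def build_speaker(slide: SlideAssets, index: int) -> None:
    # Stand-in for the speaker builder: record the name of the clip
    # that a real talking-head generator would write to disk.
    slide.speaker_clip = f"speaker_{index:03d}.mp4"


def paper_to_video_plan(paper_sections: List[str]) -> List[SlideAssets]:
    """Run the four hypothetical stages in order and return per-slide assets."""
    slides = build_slides(paper_sections)
    for i, slide in enumerate(slides):
        build_subtitles(slide)
        build_cursor(slide)
        build_speaker(slide, i)
    return slides
```

Calling `paper_to_video_plan(["Intro. We propose X.", "Results are good."])` returns two `SlideAssets` objects, each carrying its own subtitles, cursor points, and speaker clip name. Keeping all assets grouped per slide is a design choice of this sketch, made so that each slide can later be handled independently.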

Features of Paper2Video

Automated presentation video generation: Generates complete presentation videos directly from academic papers, covering slides, subtitles, voiceover, cursor movement, and speaker avatars, which reduces the time and effort of producing presentation videos by hand.
Multi-agent collaboration framework: The PaperTalker multi-agent framework assigns different tasks to specialized modules (the slide builder, subtitle builder, cursor builder, and speaker builder), making the video generation process efficient and flexible.
High-quality benchmark and evaluation metrics: Provides what the project describes as the first high-quality benchmark dataset of academic presentation videos, containing 101 papers with their corresponding author presentation videos and slides. Four evaluation metrics (Meta Similarity, PresentArena, PresentQuiz, and IP Memory) are designed to measure the quality of presentation videos.
Personalized speaker generation: Generates a personalized speaker avatar and voice from the author's portrait photo and voice sample, making videos look more authentic and professional.
Parallelized processing for efficiency: Splits the video generation task by slide and processes the slides in parallel, which reduces generation time.
Easy to use and extend: Provides a complete code implementation and a detailed usage guide so that researchers and developers can get started quickly and customize or extend the project as needed.
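The idea of splitting generation by slide and processing the pieces in parallel, mentioned in the feature list above, can be sketched with Python's standard `concurrent.futures` module. This is an illustration only: `render_slide_segment` and `render_all` are hypothetical names, the rendering step is a placeholder, and the article does not say how Paper2Video actually schedules its per-slide work.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def render_slide_segment(job: Tuple[int, str]) -> str:
    """Placeholder for the expensive per-slide work (speech, avatar, cursor).

    Returns the name of the video segment that a real renderer would produce.
    """
    index, _narration = job
    return f"segment_{index:03d}.mp4"


def render_all(narrations: List[str], max_workers: int = 4) -> List[str]:
    """Render every slide independently and return segments in slide order."""
    jobs = list(enumerate(narrations))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so the segments can be
        # concatenated afterwards without re-sorting.
        return list(pool.map(render_slide_segment, jobs))
```

Because each slide's job carries everything it needs, no slide waits on another. For CPU-bound or GPU-bound rendering, a process pool or a job queue may be a better fit than threads; the thread pool is used here only to keep the sketch short.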

Paper2Video's core strengths

Efficient and time-saving: Automatically generates presentation videos from academic papers, reducing the time and effort of creating videos by hand.
High-quality output: The generated videos aim for accurate content, clear visuals, and natural voice delivery to improve presentation quality.
Personalization: Generates a personalized speaker avatar and voice from the author's portrait and voice sample to make the video more authentic and professional.
Dedicated evaluation system: Provides a purpose-built benchmark dataset and evaluation metrics for measuring the quality and effectiveness of generated videos.
Efficient parallel processing: Processes slides in parallel to speed up video generation.

What is Paper2Video's official website?

Project website: https://showlab.github.io/Paper2Video/
GitHub repository: https://github.com/showlab/Paper2Video
arXiv technical paper: https://arxiv.org/pdf/2510.05096

Who Paper2Video is for

Academic researchers: Can quickly turn research results into presentation videos for academic conferences, seminars, or online courses.
Higher education teachers: Can turn the content of academic papers into video lessons to enrich teaching resources and improve teaching outcomes.
Graduate and doctoral students: Can prepare academic presentations and thesis defense videos more efficiently.
Research organizations: Can use it to disseminate research results and raise the institution's academic impact.
Science communicators: Can extend the reach of research by sharing it through channels such as social media.
Technology developers: Can build on the open-source code and framework to customize it and explore new application scenarios.