Buzz is an open source project by chidiwilliams for offline transcription and translation of audio on personal computers. It is built on OpenAI's Whisper, so users can transcribe and translate audio files without an Internet connection. Via GitHub, ...
Deepgram is a company focused on speech recognition and natural language processing that provides Speech-to-Text and Text-to-Speech APIs. The platform uses AI to help developers incorporate speech transcription and comprehension capabilities...
Murf AI is an online AI voice generator that converts text into lifelike speech. It offers more than 120 AI voices, supports more than 20 languages, and suits uses such as podcasts, videos, and professional presentations. Murf AI also features audio...
VideoLingo is a one-stop tool for video translation, localization, and dubbing. It is designed to generate Netflix-grade subtitles, avoiding raw machine translation and multi-line subtitles, and to add high-quality voiceovers so that knowledge can be shared across language barriers. With intuitive Streamlit ...
ALog is an AI-based voice diary application that helps users record their daily lives by voice. The project is developed by duxins and open-sourced on GitHub. Users dictate their diary entries, and the app automatically converts the speech to text and analyzes it intelligently...
Record Cafe is a one-stop audio/video processing platform that provides AI video dialog, AI subtitles, and AI speech-to-text services. Features include screen recording, video editing, and GIF/audio conversion, with support for cloud storage and sharing. The interface is intuitive and easy to use, and it also supports multi-screen recording and multi-language intelligent reading...
CrisperWhisper is a speech recognition tool based on OpenAI Whisper that focuses on fast, accurate, word-by-word transcription. It provides accurate word-level timestamps, even around filler words and pauses. CrisperWhisper works by tuning...
Babelfish.ai is a real-time transcription and translation application built on Hugging Face Transformers.js and Supabase Realtime. The application loads large models in the browser and runs them locally to provide real-time speech-to-text and translation. Users can use the simple...
FreeTTS is a free online text-to-speech tool that converts text to natural-sounding voice files. It supports multiple languages and voice options, and users can export to MP3, WAV, OGG, and AAC formats. FreeTTS also provides voice transcription, sound...
Easy-Voice-Toolkit is a multifunctional toolkit built on open source speech projects. It provides automated audio tools for speech recognition, speech transcription, speech conversion, dataset creation, and model training. Users can use these tools selectively or sequentially as needed...
Dupdub is a platform focused on creating podcasts and video presentations, offering a range of AI tools to support user creativity. Features include text-to-video creation, AI voice and video dubbing, video editing, transcription, and subtitling. Dupdub has also launched...
Tongyi Tingwu (Tongyi Listening and Understanding) is an AI assistant for work and study launched by Alibaba Cloud that focuses on transcribing and analyzing audio and video content. It relies on Alibaba Cloud's AI models to transcribe audio and video into text in real time, and provides translation, summarization, positioning, and other functions. Tongyi Tingwu supports multiple languages and scenarios...
insanely-fast-whisper is an audio transcription tool that combines OpenAI's Whisper model with several optimization techniques (e.g. Transformers, Optimum, Flash Attention) and provides a command line interface (CLI) designed to transcribe large amounts of audio quickly and efficiently. It uses Whi...
MemoAI is a video translation tool that converts video and audio files into text, subtitles, and notes. It handles YouTube videos, podcasts, and local files alike. It supports transcription and translation in more than 90 languages, including Chinese, English, and Japanese. MemoAI...
pyVideoTrans is a video translation and dubbing tool. Users can translate video content from one language to another and add matching voiceovers and subtitles to the video. It is based on the openai-whisper offline model and supports a variety of translation and speech synthesis services, ex...