AI Speech to Text

Total 56 articles

Abogen: a tool for converting multiple text formats to audiobooks

General Introduction Abogen is an open source tool designed to quickly convert ePub, PDF or plain text files to high quality audio. It uses the Kokoro-82M model to generate natural and smooth speech, and supports synchronized subtitle generation, which is suitable for producing audiobooks...

11mos ago

077.8K

Kimi-Audio: Open Source Base Model for Audio Processing and Dialogue

Comprehensive Introduction Kimi-Audio is an open source audio base model developed by Moonshot AI that focuses on audio understanding, generation and dialog. It supports a wide range of audio processing tasks such as speech recognition, audio Q&A and speech emotion recognition. The model has been tested over 130...

Latest AI Resources # AI Java Open Source Project # AI text-to-speech # AI Speech to Text

11mos ago

0124.4K

On-Device AI: AI Voice Transcription and Chat Tool That Runs Natively on iPhone

Comprehensive Introduction On-Device AI is an AI app that runs completely offline and is designed for Apple devices, with support for iOS, macOS, and visionOS. It provides local large language model (LLM) inference, real-time speech transcription, document analysis, and more, without the need to link...

Latest AI Resources # AI Localized Chat Application # AI Speech to Text

11mos ago

075.7K

Vexa: a real-time meeting transcription and intelligent knowledge extraction tool

Comprehensive Introduction Vexa is an open source real-time meeting transcription and knowledge management platform designed to provide efficient meeting recording and intelligent knowledge extraction services for enterprises and individuals. It automatically joins platforms such as Google Meet, Zoom, etc. through API-driven meeting robots...

Latest AI Resources # AI Java Open Source Project # AI Text and Audio/Video Summarization Tool # AI Speech to Text

12mos ago

0104.6K

Open source tool for real-time speech to text

General Introduction realtime-transcription-fastrtc is an open source project focused on converting speech to text in real time. It uses FastRTC technology to process low-latency audio streams, combined with the local Whisper model to achieve efficient ...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

1yr ago

067.8K

Transkriptor: an AI-powered transcription tool that turns audio and video into text

General Introduction Transkriptor is an AI-driven transcription tool that focuses on quickly converting audio and video to text. It supports over 100 languages with an accuracy rate of up to 99% and is suitable for a wide range of scenarios such as meetings, interviews, classroom notes, and more. Users can upload files, direct...

Latest AI Resources # AI Text and Audio/Video Summarization Tool # AI Translation # AI Speech to Text

1yr ago

078.8K

Otter.ai: Intelligent meeting assistant and real-time voice transcription tool

General Description Otter.ai is an AI-powered meeting management and voice transcription tool whose core function is to convert speech to text in real time and automatically generate meeting notes, summaries and action items. It is supported by an AI Meeting Agent that automatically adds...

Latest AI Resources # AI Text and Audio/Video Summarization Tool # AI Speech to Text

1yr ago

068.5K

TurboScribe: Online tool to quickly convert audio and video to text

General Description TurboScribe is an AI-based transcription tool that focuses on quickly converting audio and video to text. It supports more than 98 languages with an accuracy rate of 99.8% for users who need to process voice content efficiently. Users can upload files, generate text notation...

Latest AI Resources # AI Speech to Text

1yr ago

087.6K

Aqua Voice: Cross-Application Speech Input to Generate Accurate Text

General Introduction Aqua Voice is an intelligent speech-based text generation tool focused on quickly converting user speech into formatted text. It was created in 2023 by Finnian Brown and Jack McIntire and is based in the United States...

Latest AI Resources # AI Speech to Text

1yr ago

0180.2K

Dolphin: Language Identification and Speech-to-Text Model for Asian Languages

Comprehensive Introduction Dolphin is an open source model developed by DataoceanAI and Tsinghua University, focusing on speech recognition and language recognition for Asian languages. It supports 40 languages in East Asia, South Asia, Southeast Asia, and the Middle East, as well as 22 Chinese dialects...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

1yr ago

062.1K

TwinMind: free offline voice to text transcription app

General Introduction TwinMind is a smart tool developed by ThirdEar AI, Inc. that "remembers everything for you". It can record conversations, meetings, or lectures in real time and convert them to text in more than 100 languages, even with your phone in your pocket...

Latest AI Resources # AI Text and Audio/Video Summarization Tool # AI Speech to Text

1yr ago

060K

Wispr Flow: Use your voice to quickly enter text in any application

General Description Wispr Flow is a tool for entering text by voice, helping users to write quickly on their computers. It's a "3x faster than typing" experience that allows users to enter text into any application such as Word, Slack or G...

Latest AI Resources # AI Speech to Text

10mos ago

0127.4K

Meeting: open source client for local real-time transcription and meeting minutes generation

General Introduction Meeting Minutes (a.k.a. Meetily) is a free and open source AI meeting assistant tool developed by Zackriya Solutions that focuses on capturing meeting audio in real-time, generating transcribed text and automatically extracting meeting...

Latest AI Resources # AI Java Open Source Project # AI Text and Audio/Video Summarization Tool # AI Speech to Text

1yr ago

0106.8K

Local-NotebookLM: open source tool for generating voice podcasts from PDFs locally

Comprehensive Introduction Local-NotebookLM is an open source project that aims to provide locally run intelligent document processing and content generation tools. It is inspired by Google NotebookLM, focusing on helping users convert PDF and other documents into a variety of ...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

1yr ago

088.3K

AssemblyAI: High-precision Speech-to-Text and Audio Intelligence Analysis Platform

General Introduction AssemblyAI is a platform focused on speech AI technology, providing developers and enterprises with efficient speech-to-text and audio analysis tools. Its core highlight is the Universal family of models, especially the newly released Universal-2...

Latest AI Resources # AI Open Services # AI Speech to Text

1yr ago

070.5K

FireRedASR: An Open Source Model for Multilingual High-Precision Speech Recognition

General Introduction FireRedASR is a speech recognition model developed and open-sourced by the FireRed team at Xiaohongshu (Little Red Book), focusing on providing high-precision, multilingual automatic speech recognition (ASR) solutions. The project is hosted on GitHub for developers and researchers, and offers...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

1yr ago

0100.8K

WhisperChain: real-time speech-to-text and optimization of spoken words

General Introduction WhisperChain is an AI-based open source project hosted on GitHub and led by developer Chris Choy. It is mainly used to convert speech into text and automatically optimize the expression through AI technology to remove redundancy...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

1yr ago

057.3K

LLPlayer: Video player that generates real-time subtitles with bilingual translation

General Introduction LLPlayer is an open source media player for language learners, hosted on GitHub and created by developer umlx5h. It integrates a variety of useful features such as bilingual subtitle display, AI auto-generated subtitles, real-time translation and word search...

Latest AI Resources # AI Java Open Source Project # AI Translation # AI Speech to Text

10mos ago

0214.8K

CapsWriter-Offline: Speech Input and Subtitle Transcription Tool for the PC

General Introduction CapsWriter-Offline is a voice input and subtitle transcription tool for PC, hosted on GitHub and built by developer HaujetZhao. It runs completely offline and does not require an Internet connection to realize speech-to-text and audio-visual...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

1yr ago

067.5K

Whisper Input: a free and high-speed voice-to-text transcription service using Groq

General Description Whisper Input is an open source voice transcription tool that allows users to start recording by pressing the Option key and end the recording by releasing it. The tool calls Groq Whisper Large V3 Turbo ...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

1yr ago

073.8K

LiberSonora: Audiobook Subtitle Extraction and Multilingual Translation, Audiobook Transcription into Multiple Languages

General Introduction LiberSonora, which means "free sound", is a powerful AI-enabled open source audiobook toolset. The toolset supports intelligent subtitle extraction, AI title generation, multi-language translation, etc., and is capable of batch offline processing under GPU acceleration. LiberSo...

Latest AI Resources # AI Java Open Source Project # AI Translation # AI Speech to Text

1yr ago

054.5K

AudioNotes: Quickly Extract Audio and Video Content and Generate Structured Notes

Comprehensive Introduction AudioNotes is an audio/video to structured notes system built on FunASR and Qwen2. It can quickly extract audio/video content and call a large language model to organize it into structured Markdown notes, which is convenient for...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

1yr ago

057.7K

Orate: A Unified API for Integrating Well-Known Speech Generation, Speech Transcription and Voice Change Models

Comprehensive Introduction Orate is an AI toolkit focused on speech generation and transcription. It provides a unified API that seamlessly integrates with leading AI providers such as OpenAI, ElevenLabs, and AssemblyAI to help users create forced...

Latest AI Resources # AI Java Open Source Project # AI text-to-speech # AI Speech to Text

1yr ago

064.7K

PengChengStarling: Smaller and Faster Multilingual Speech-to-Text Tool than Whisper-Large v3

Comprehensive Introduction PengChengStarling (PengCheng Labs) is a multilingual Automatic Speech Recognition (ASR) tool capable of converting speech in different languages into corresponding text. This toolkit is developed based on the icefall project and provides a complete speech recognition process...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

1yr ago

060.9K

RealtimeSTT: Real-time Speech-to-Text Tool for Low-Latency Streaming Speech Recognition Based on Whisper

General Introduction RealtimeSTT is an efficient, low-latency real-time speech-to-text library with advanced voice activity detection and wake word activation. It was developed by Kolja Beigel to support applications that require fast and accurate speech-to-text...

AI News # AI Java Open Source Project # AI Speech to Text

1yr ago

089.7K

Sherpa-ONNX: Offline Speech Recognition and Synthesis with ONNXRuntime

General Introduction sherpa-onnx is an open source project developed by the Next-gen Kaldi team to provide efficient offline speech recognition and speech synthesis solutions. It supports multiple platforms including Android, iOS, Raspber...

Latest AI Resources # AI Java Open Source Project # AI text-to-speech # AI Speech to Text

1yr ago

0286.5K

Acoust: Online AI Speech Generation and Text-to-Speech (TTS) Services Platform

General Introduction Acoust is an online AI speech generation and text-to-speech (TTS) service platform that utilizes the latest AI technology to generate realistic speech. The platform also provides powerful video editing tools that allow users to complete video production without the need to use multiple software. Acou...

Latest AI Resources # AI text-to-speech # AI Speech to Text

1yr ago

054.3K

Notta: AI meeting recording and audio transcription tool to automatically transcribe meetings, interviews or recordings

General Description Notta is a powerful AI meeting recording and audio transcription tool designed to help users automatically convert meetings, interviews or audio recordings into searchable text. With Notta, users can easily transcribe, edit, summarize and collaborate to boost productivity. Notta supports...

Latest AI Resources # AI Text and Audio/Video Summarization Tool # AI Speech to Text

1yr ago

078.3K

AI no jimaku gumi: Automatic generation and translation of multilingual subtitles for videos with the help of AI

General Introduction AI no jimaku gumi (AI subtitle group) is a powerful command line video subtitle processing tool focused on automating video subtitle extraction, transcription and translation functions. The tool integrates advanced AI technologies, including Whisper speech...

Latest AI Resources # AI Java Open Source Project # AI Translation # AI Speech to Text

1yr ago

064.1K

FunClip: Intelligent editing of video content into short clips, easy to realize accurate video clip extraction/cropping

Comprehensive Introduction FunClip is a fully open source, locally run automatic video editing tool developed by the Tongyi Speech Lab of Alibaba DAMO Academy. The tool integrates the industrial-grade Paraformer-Large speech recognition model, which can accurately recognize the speech in the video...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text # AI audio/video editor

1yr ago

0112.5K

BetterWhisperX: Automatic speech recognition with speaker separation, providing highly accurate word-level timestamps

Comprehensive Introduction BetterWhisperX is an optimized version of the WhisperX project focused on providing efficient and accurate automatic speech recognition (ASR) services. An improved offshoot of WhisperX, the project was developed by Federico ...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

1yr ago

076.7K

Freed: AI medical transcription assistant that accurately transcribes doctor-patient conversations and reduces visit documentation paperwork

General Description Freed is an AI medical transcription assistant designed for healthcare professionals. It helps doctors and other healthcare practitioners automate the recording of patient visits, reduce paperwork, and increase work efficiency through advanced AI technology. Freed's AI transcription...

Latest AI Resources # AI Speech to Text

1yr ago

057.9K

Voicenotes: AI voice notes, record and transcribe voice, intelligently manage meeting content

General Introduction Voicenotes is a smart voice notes app designed to help users easily record and manage voice notes and meetings. The app supports voice transcription in more than 100 languages: users simply speak an idea and Voicenotes automatically transcribes it into text...

Latest AI Resources # AI Notes # AI Speech to Text

1yr ago

066.3K

Voice-Pro: open source multifunctional video translation tool, voice transcription and translation into multiple languages, Windows one-click installation

General Introduction Voice-Pro is a multifunctional tool based on Gradio WebUI that supports speech-to-text, text-to-speech, real-time translation, YouTube video downloads and human voice separation. It integrates Whisper, Faster-Wh...

Latest AI Resources # AI Java Open Source Project # AI Translation # AI Speech to Text

1yr ago

072.5K

Zamzar: Multi-functional online file format conversion tool, video conversion | audio conversion | image conversion | document conversion

General Introduction Zamzar is a powerful online file conversion tool that supports over 1200 file formats. Whether it's documents, pictures, videos, audio or eBooks, Zamzar can convert them quickly and efficiently. Users don't need to download any software...

Latest AI Resources # AI Open Services # AI Speech to Text

1yr ago

079.3K

AI Hear: Real-Time Speech Transcription and Translation Software That Runs Locally and Offline

General Description If you're using a MacBook, try AI Hear: you can record audio, convert speech to text locally in real time, translate it, and finally export subtitles. You can use it to help you follow cross-border meetings and English audiobooks. AI Hear is a locally running software that provides one-click real-time...

Latest AI Resources # AI Translation # AI Speech to Text

1yr ago

063.6K

SoniTranslate: open source video translation and dubbing solution, multi-person dubbing, adjust the speed of speech and mimic the original sound

General Description SoniTranslate is a powerful and user-friendly video multilingual dubbing tool designed to provide a solution for video translation and synchronized audio. It uses advanced speech recognition and machine translation technologies to translate video content into multiple languages and keep the audio synchronized. The program ...

Latest AI Resources # AI text-to-speech # AI Translation # AI Speech to Text

1yr ago

0138.9K

FunASR: Open Source Speech Recognition Toolkit, Speaker Separation / Multi-Person Conversation Speech Recognition

Comprehensive Introduction FunASR is an open source speech recognition toolkit developed by Alibaba DAMO Academy to bridge academic research and industrial applications. It supports a wide range of speech recognition features, including speech recognition (ASR), voice endpoint detection (VAD), punctuation recovery, language modeling, speaking...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

2yrs ago

0158.3K

AsrTools: speech-to-subtitle tool, lightweight client with built-in interfaces to JianYing, Kuaishou, and Bcut

Comprehensive Introduction AsrTools is an intelligent speech-to-text tool with built-in interfaces from big players such as JianYing, Kuaishou, Bcut, etc. It does not require a GPU or cumbersome configuration, and supports efficient multi-threaded batch processing. Built with PyQt5, it has an attractive, user-friendly interface and can output SRT and TXT format text...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

2yrs ago

077.3K

Happy Scribe: Audio Transcription and Video Subtitling Platform | Free Video Subtitle Editing Software

General Description Happy Scribe provides automated and manual audio transcription services to convert audio to text with high accuracy and support for multiple languages and formats. It includes an interactive editor, collaboration tools, multiple export formats, machine translation, and other features...

Latest AI Resources # AI Speech to Text # AI audio/video editor

2yrs ago

069.2K

Whisper GPGPU: OpenAI Whisper running on Windows|Whisperdesktop

Comprehensive Introduction Whisper is a GitHub open source project developed by Const-me, focusing on high performance inference of OpenAI's Whisper Automatic Speech Recognition (ASR) model using GPGPU. This project is based on the MPL-2.0 license...

Latest AI Resources # AI Speech to Text

2yrs ago

0106.2K

Buzz: open source offline audio transcription and translation tool | iOS voice transcription

General Introduction Buzz is an open source project created by chidiwilliams that enables offline transcription and translation of audio on personal computers. The project relies on OpenAI's Whisper technology, which allows users to not rely on an Internet connection for audio text...

Latest AI Resources # AI Speech to Text

2yrs ago

0143.1K

Deepgram: service API for high-precision speech recognition and synthesis solutions

General Description Deepgram is a company focused on speech recognition and natural language processing technologies, providing powerful Speech-to-Text and Text-to-Speech APIs. The platform utilizes advanced artificial intelligence...

Latest AI Resources # AI Open Services # AI Speech to Text

1yr ago

075.2K

Murf AI: Voice Changer|Speech to Text|Text to Speech|Audio Editor

General Introduction Murf AI is a powerful online artificial intelligence voice generation tool that converts text into near-human speech. It offers 120+ AI voice options, supports 20+ languages, and is suitable for a variety of occasions, such as podcasts, videos, professional presentations, etc. Mu...

Latest AI Resources # AI text-to-speech # AI Speech to Text

2yrs ago

057K

VideoLingo: video transcription word-level timeline subtitles, video subtitle translation and localized dubbing open source tools

General Description VideoLingo is a one-stop video translation and localization dubbing tool designed to generate Netflix-grade, high-quality subtitles, eliminating raw machine translation and multi-line subtitles, and adding high-quality voiceovers that enable global knowledge to be shared across language barriers. By...

Latest AI Resources # AI Side Hustle Money Making Programs # AI Translation # AI Speech to Text

1yr ago

065.4K

ALog: portable AI voice diary app with speech-to-text support

General Introduction ALog is an AI-based voice diary application designed to help users record their daily lives by voice. The project is developed by duxins and open-sourced on GitHub. Users can record their diary through voice input, and the app will automatically convert the voice into text...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

1yr ago

061K

Record Cafe: One-stop Audio/Video Processing Platform|Video Generation|AI Subtitle|Audio Extraction|Speech to Text

Comprehensive Introduction Record Cafe is a one-stop audio/video processing platform that provides AI video dialog, AI subtitles and AI speech to text services. Functions include recording screen, editing video, converting GIF/audio, etc., and supports cloud storage and sharing. The interface is intuitive and easy to use, and it also supports multi-screen recording and multi-language smart...

Latest AI Resources # AI text to video # AI text-to-speech # AI Speech to Text

1yr ago

066.6K

CrisperWhisper: Accurate Verbatim Speech Transcription Tool

General Description CrisperWhisper is an advanced speech recognition tool based on OpenAI Whisper that focuses on fast, accurate and word-by-word speech transcription. It provides accurate word-level timestamps, even around filler words and pauses...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

1yr ago

071K

Babelfish.ai: Browser-Run Real-Time Speech Transcription and Translation Application

General Introduction Babelfish.ai is a real-time transcription and translation application built on Hugging Face Transformers.js and Supabase Realtime. The application can load large models in the browser and...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

2yrs ago

052.7K

FreeTTS: Free Online Text-to-Speech Tool|Audio Enhancement|Audio Clips

General Description FreeTTS is a free online text-to-speech tool that allows users to convert text to natural sounding voice files. Supporting multiple languages and sound options, users can convert text to MP3, WAV, OGG and AAC formats...

Latest AI Resources # AI text-to-speech # AI Speech to Text # AI audio/video editor

2yrs ago

068.5K

Easy Voice Toolkit: AI Voice Toolkit for Local Deployment

Comprehensive Introduction Easy-Voice-Toolkit is a multifunctional toolkit based on the Open Source Speech Project, providing a variety of automated audio tools for speech recognition, speech transcription, speech conversion, dataset creation and model training. Users can selectively use these tools as needed...

Latest AI Resources # AI Java Open Source Project # AI text-to-speech # AI voice cloning

2yrs ago

063.5K

DupDub: AI-powered Video Editor|Dubbing|Video Translation|Photo Digitizer

General Description DupDub is a creation platform focused on podcasts and video presentations that offers a range of AI tools to support users' creativity. Features cover text to video creation, AI voice and video dubbing services, as well as video editing, transcription and subtitling. DupDub is also ...

Latest AI Resources # AI Digital Man # AI text-to-speech # AI Speech to Text

2yrs ago

055.1K

Tongyi Listening and Understanding: Alibaba Tongyi AI Assistant for Audio and Video Content Transcription

Comprehensive Introduction Tongyi Listening and Understanding is a work-study AI assistant launched by Alibaba Cloud, focusing on transcribing and analyzing audio and video content. It relies on Alibaba Cloud's powerful AI models to transcribe audio and video content into text in real time, and provides translation, summarization, positioning and other functions. Tongyi Listening and Understanding supports multiple languages and scenarios...

Latest AI Resources # AI Text and Audio/Video Summarization Tool # AI Speech to Text

2yrs ago

067.6K

Insanely Fast Whisper: open source project for fast and efficient speech-to-text transcription

Comprehensive Introduction insanely-fast-whisper is a combination of OpenAI's Whisper model and various optimization techniques (e.g. Transformers, Optimum, Flash Attention) for audio trans...

Latest AI Resources # AI Java Open Source Project # AI Speech to Text

1yr ago

069.2K

Memo AI: Native Client for Video to Subtitle, Converting Multilingual Subtitles

General Description MemoAI is a powerful video translation tool specialized in converting video and audio files to text, subtitles and notes. Whether it's a YouTube video, a podcast or a local file, MemoAI can handle it with ease. It supports more than 90 languages such as Chinese, English, Japanese...

Latest AI Resources # AI text-to-speech # AI Speech to Text # AI audio/video editor

1yr ago

065.8K

pyvideotrans: Video Translation Dubbing Tool

General Introduction pyvideotrans is a video translation dubbing tool. Users are able to translate video content from one language to another, and add appropriate dubbing and subtitles to the video. It is based on openai-whisper offline...

Latest AI Resources # AI text-to-speech # AI Speech to Text # AI audio/video editor

2yrs ago

082.8K
