Total 53 articles

Tag: AI Speech to Text (page 2)

Orate: A Unified API for Integrating Well-Known Speech Generation, Speech Transcription and Voice Change Models

Orate is an AI toolkit focused on speech generation and transcription. It provides a unified API that integrates with leading AI providers such as OpenAI, ElevenLabs, and AssemblyAI to help users create realistic, human-like speech and transcribe audio into text. Ora...

PengChengStarling: Smaller and Faster Multilingual Speech-to-Text Tool than Whisper-Large v3

PengChengStarling (PengCheng Labs) is a multilingual automatic speech recognition (ASR) toolkit that converts speech in different languages into the corresponding text. The toolkit is built on the icefall project and provides a complete speech recognition pipeline, including data processing, model training,...

2025-01-30 | AI tools, AI open source project, AI Speech to Text

RealtimeSTT: Real-time Speech-to-Text Tool for Low-Latency Streaming Speech Recognition Based on Whisper

RealtimeSTT is an efficient, low-latency real-time speech-to-text library with advanced voice activity detection and wake word activation. It was developed by Kolja Beigel to support applications that require fast and accurate speech-to-text conversion. Whether you are building a voice assistant or need to fin...

2025-01-18 | AI News, AI open source project, AI Speech to Text

Sherpa-ONNX: Offline Speech Recognition and Synthesis with ONNXRuntime

sherpa-onnx is an open source project developed by the Next-gen Kaldi team that provides efficient offline speech recognition and speech synthesis. It supports a variety of platforms, including Android, iOS, and Raspberry Pi, and can operate without network connectivity in real time ...

2025-01-16 | AI tools, AI open source project, AI Text-to-Speech, AI Speech to Text

Acoust: Online AI Speech Generation and Text-to-Speech (TTS) Services Platform

Acoust is an online AI speech generation and text-to-speech (TTS) platform that uses AI to generate realistic speech. The platform also provides video editing tools that let users create videos without switching between multiple programs. Acoust supports more than 30 languages...

2025-01-10 | AI tools, AI Text-to-Speech, AI Speech to Text

Notta: AI meeting recording and audio transcription tool to automatically transcribe meetings, interviews or recordings

Notta is an AI meeting recording and audio transcription tool that automatically converts meetings, interviews, or audio recordings into searchable text. With Notta, users can transcribe, edit, summarize, and collaborate on recordings. Notta supports 58 languages for transcription...

2025-01-09 | AI tools, AI text and audio/video summarization tool, AI Speech to Text

AI no jimaku gumi: Automatic generation and translation of multilingual subtitles for videos with the help of AI

AI no jimaku gumi ("AI subtitle group") is a command-line video subtitle tool that automates subtitle extraction, transcription, and translation. The tool integrates the Whisper speech recognition model and a variety of translation backends (such as Dee...

2025-01-06 | AI tools, AI open source project, AI translation, AI Speech to Text

FunClip: Intelligently edit video content into short clips, with accurate clip extraction and trimming

FunClip is a fully open source, locally deployed automatic video clipping tool developed by the Tongyi speech lab at Alibaba's DAMO Academy. The tool integrates the industrial-grade Paraformer-Large speech recognition model, which accurately recognizes the speech in a video and converts it to text. Special Features...

2025-01-03 | AI tools, AI open source project, AI Speech to Text, AI audio and video editing

BetterWhisperX: Automatic speech recognition and speaker diarization with highly accurate word-level timestamps

BetterWhisperX is an optimized project based on WhisperX, focused on efficient and accurate automatic speech recognition (ASR). As an improved fork of WhisperX, the project is maintained by Federico Torrielli, who is committed to keeping it continuously updated and improving its performance...

2024-12-29 | AI tools, AI open source project, AI Speech to Text

Freed: AI medical transcription assistant that accurately transcribes doctor-patient conversations and reduces visit documentation paperwork

Freed is an AI medical transcription assistant designed for healthcare professionals. It helps doctors and other healthcare practitioners automate the recording of patient visits and reduce paperwork. Freed's AI transcription assistant is able to listen in real time,...

2024-12-27 | AI tools, AI Speech to Text

Voicenotes: AI voice notes, record and transcribe voice, intelligently manage meeting content

Voicenotes is a voice notes app designed to help users record and manage voice notes and meetings. The app supports voice transcription in more than 100 languages. Users simply speak their thoughts and Voicenotes automatically transcribes them into text. Whether you are a student, professional...

2024-12-25 | AI tools, AI Notes, AI Speech to Text

Voice-Pro: open source multifunctional video translation tool, voice transcription and translation into multiple languages, Windows one-click installation

Voice-Pro is a multifunctional tool based on the Gradio WebUI that supports speech-to-text, text-to-speech, real-time translation, YouTube video downloads, and vocal separation. It integrates Whisper, Faster-Whisper, and Whisper-Timestamped to provide efficient...

2024-11-24 | AI tools, AI open source project, AI translation, AI Speech to Text

Zamzar: Multi-functional online file format conversion tool, video conversion | audio conversion | image conversion | document conversion

Zamzar is an online file conversion tool that supports over 1200 file formats. Whether the input is a document, image, video, audio file, or eBook, Zamzar converts it quickly. Users don't need to download any software, they just need to select the text...

2024-11-04 | AI tools, AI Open Services, AI Speech to Text

AI Hear: Real-Time Speech Transcription and Translation Software That Runs Locally and Offline

If you're using a MacBook, try AI Hear: you can record audio, transcribe speech to text locally in real time, translate it, and finally export subtitles. You can use it to help you follow international conferences and English audiobooks. AI Hear is locally run software that provides one-click real-time translation and transcription, supports multiple...

2024-11-03 | AI tools, AI translation, AI Speech to Text

SoniTranslate: Open source video translation and dubbing solution with multi-speaker dubbing, adjustable speech rate, and imitation of the original voice

SoniTranslate is a multilingual video dubbing tool that provides video translation with synchronized audio. It uses speech recognition and machine translation to translate video content into multiple languages while keeping the audio synchronized. The program is based on Gradi...

2024-10-27 | AI tools, AI Text-to-Speech, AI translation, AI Speech to Text

FunASR: Open Source Speech Recognition Toolkit, Speaker Diarization / Multi-Speaker Conversation Speech Recognition

FunASR is an open source speech recognition toolkit developed by Alibaba's DAMO Academy to bridge academic research and industrial applications. It supports a wide range of speech processing features, including automatic speech recognition (ASR), voice activity detection (VAD), punctuation restoration, language modeling, speaker verification, speak...

2024-10-16 | AI tools, AI open source project, AI Speech to Text

AsrTools: Speech-to-subtitle tool, a lightweight client with built-in interfaces to Jianying (剪映), Kuaishou (快手), and Bijian (必剪)

AsrTools is a speech-to-text tool with built-in interfaces to services from major vendors such as Jianying (剪映), Kuaishou (快手), and Bijian (必剪). It requires neither a GPU nor cumbersome configuration, and it supports efficient multi-threaded batch processing. It is built with PyQt5, has a user-friendly interface, and outputs subtitle files in SRT and TXT formats. The tool works by tuning ...

2024-10-14 | AI tools, AI open source project, AI Speech to Text

Happy Scribe: Audio Transcription and Video Subtitling Platform | Free Video Subtitle Editing Software

Happy Scribe provides automated and manual audio transcription services that convert audio to text with high accuracy, with support for multiple languages and formats. It includes an interactive editor, collaboration tools, multiple export formats, machine translation, and more. The platform is safe and reliable,...

2024-10-09 | AI tools, AI Speech to Text, AI audio and video editing

Whisper GPGPU: OpenAI Whisper running on Windows | WhisperDesktop

Whisper is an open source project on GitHub developed by Const-me, focusing on high-performance inference of OpenAI's Whisper automatic speech recognition (ASR) model using GPGPU. The project is released under the MPL-2.0 license, and the latest version, 1.12, was released on 2023-07-22. In lieu of ...

2024-10-09 | AI tools, AI Speech to Text
