AI Personal Learning
and practical guidance
Total 65 articles

Tags: ai text to speech Page 3

Resemble AI: Artificial Intelligence Speech Synthesis Platform | Voice Cloning | Deepfake Audio Detection

General Introduction: Resemble AI is an AI speech synthesis platform designed for enterprises. The platform provides AI voice generation and deepfake audio detection to safeguard information security. Features include voice cloning, real-time deepfake audio detection, AI watermarking technology, rich emotion...

Ondoku: Online Text Reader | Text to Speech | Image to Speech Reader

General Introduction: Ondoku is an online text-to-speech tool. Users enter text into the text box on the website, and the tool reads it aloud and can save the audio as an MP3 file. The service is suitable both for instant listening and for generating audio...

XAudioPro: Professional Online Audio Editing Tool | Audiobook Maker | Text to Speech | Accompaniment Separation

General Introduction: XAudioPro is an advanced online tool for real-time audio editing and transcoding that is both professional and portable. It supports professional audio editing functions such as cutting, cropping, copying, deleting, restoring, and amplitude gain control. It also provides denoising services such as spectral subtraction noise reduction, low-pass spectral reduction...

Hume AI: Empowering AI with Emotion Recognition | Recognizing Emotional States from Sounds and Expressions | Generating Speech with Emotional States

General Introduction: Hume AI is an AI company focused on emotional intelligence, developing multimodal AI technologies that understand and respond to human emotions. Its flagship product, the Empathic Voice Interface (EVI), recognizes and responds to user emotions in multiple forms, including speech, facial expressions, and language, to enhance...

Magic Voice Workshop: Professional Voice-Over and Short Video Narration Platform | Human Voice-Over | Voice Cloning | One-Click Video Creation

General Introduction: Magic Voice Workshop is a one-stop short video and AI voice-over platform covering software voice-over, human voice-over, sound libraries, voice cloning services and more. The platform integrates audio editing, AI copywriting, video editing and collaboration tools for audio-related services and content creation. Users experience the audio editor...

Record Cafe: One-Stop Audio/Video Processing Platform | Video Generation | AI Subtitles | Audio Extraction | Speech to Text

General Introduction: Record Cafe is a one-stop audio/video processing platform that provides AI video dialog, AI subtitles and AI speech-to-text services. Features include screen recording, video editing, and GIF/audio conversion, with support for cloud storage and sharing. The interface is intuitive and easy to use, and it also supports multi-screen recording and multi-language intelligent reading...

IMS Toucan: Fast and Controllable Multilingual (7000+ languages supported) Text-to-Speech Tool

General Introduction: IMS Toucan is a state-of-the-art text-to-speech (TTS) toolkit developed by the Institute for Natural Language Processing (IMS) at the University of Stuttgart, Germany. Supporting more than 7000 languages, the toolkit is fast, controllable and has low computational resource requirements. IMS Toucan is designed for research, teaching...

ChatTTS: a speech generation model that mimics the voice of a real person speaking (ChatTTS one-click acceleration package)

General Introduction: ChatTTS is a generative speech model designed for conversational scenarios. It generates natural and expressive speech, supports multiple languages and multiple speakers, and is suitable for interactive conversations. The model can predict and control fine-grained prosodic features such as laughter, pauses, and interjections...
