AI voice cloning

41 articles in total
MegaTTS3: A Lightweight Model for Synthesizing Chinese and English Speech

MegaTTS3 is an open source speech synthesis tool developed by ByteDance in collaboration with Zhejiang University that focuses on generating high-quality Chinese and English speech. Its core model has only 0.45B parameters, making it lightweight and efficient, and it supports mixed Chinese-English speech generation and voice cloning. The project is hosted on ...
4 months ago
0 · 1.3K
Step-Audio: A Multimodal Voice Interaction Framework That Recognizes Speech and Converses in Cloned Voices, Among Other Features

Step-Audio is an open source intelligent speech interaction framework designed to provide out-of-the-box speech understanding and generation for production environments. The framework supports multilingual dialogue (e.g., Chinese, English, Japanese), emotional speech (e.g., happy, sad), regional dialects (e.g., Cantonese, Sichuanese ...
6 months ago
0 · 2K
Llasa 1~8B: An Open Source Text-to-Speech Model for High-Quality Speech Generation and Cloning

Llasa-3B is an open source text-to-speech (TTS) model developed by the Audio Lab of the Hong Kong University of Science and Technology (HKUST Audio). The model is based on the Llama 3.2 3B architecture and has been carefully tuned to provide high-quality speech generation that not only supports multiple...
6 months ago
0 · 1.7K
Fish Agent: An End-to-End AI Voice Cloning Assistant for Real-Time Voice Conversation, a Fish Speech Spin-Off Project

Fish Agent, a project derived from Fish Speech, is an end-to-end AI voice cloning system built on the V0.1 3B model architecture. As a fully end-to-end voice cloning system, its most important feature is the use of innovative speechless...
7 months ago
0 · 2K
ViiTor AI: Multilingual Audio/Video Translation, Speech Synthesis, and Voice Cloning Service

ViiTor AI is an artificial intelligence platform focused on high-quality video translation, voice cloning, AI-generated avatar videos, and speech synthesis. The platform supports multiple languages and is designed to help users create multilingual content easily. ViiTo...
8 months ago
0 · 2.4K
趣丸千音 (Quwan Qianyin): Voice Cloning with Lip Sync for One-Click Video Translation into Multiple Languages

趣丸千音 (Quwan Qianyin) is a multilingual AI voice synthesis platform that provides realistic, natural voice generation. Users can easily convert text into professional-grade audio, and the platform supports zero-shot creation of custom AI voices (voice clones) to meet personalized needs. The platform also provides video translation features to help...
8 months ago
0 · 1.7K
CosyVoice: Alibaba's Open Source Project for Rapid 3-Second Voice Cloning, with Support for Emotion Control Tags

CosyVoice is a multilingual large-scale speech generation model that provides full-stack capabilities spanning inference, training, and deployment. Developed by the FunAudioLLM team, it aims to achieve high-quality speech through advanced autoregressive transformers and ODE-based diffusion models...
6 months ago
0 · 3.2K
Coqui TTS (xTTS): A Deep Learning Toolkit for Text-to-Speech Generation with Multilingual Support and Voice Cloning

Coqui TTS is an open source text-to-speech (TTS) generation toolkit based on deep learning. It has been battle-tested in both research and production environments, and provides a rich set of features and models that support text-to-speech conversion in multiple languages. Coqui TTS...
6 months ago
0 · 2K
自得语音 (Zide Voice): Intelligent Speech Synthesis Platform | Voice Cloning

Zide Voice is a voice synthesis platform that uses advanced AI technology. Users simply upload a voice sample and supply text to generate realistic, expressive voice clips. The platform offers quick character customization, cloud-based voice generation, and human-like voice synthesis. There is no need to download any software through...
10 months ago
0 · 1.7K
Resemble AI: AI Speech Synthesis Platform | Voice Cloning | Deepfake Audio Detection

Resemble AI is an artificial intelligence speech synthesis platform designed for the enterprise. The platform provides AI voice generation technology and deepfake audio detection to safeguard information security. Features include voice cloning, real-time deepfake audio detection, AI watermarking technology...
10 months ago
0 · 2K
魔音工坊 (Magic Voice Workshop): Professional Voice-Over and Short Video Narration Platform | Human Voice-Over | Voice Cloning | One-Click Video Creation

Magic Voice Workshop is a one-stop short video and AI dubbing platform offering software dubbing, human voice-over, sound libraries, cloning services, and more. The platform integrates audio editing, AI copywriting, video editing, and collaboration tools for audio-related services and content creation. Users experience the audio editor...
10 months ago
0 · 1.6K
度加 (Dujia): One-Click Script-to-Video, Rapid Voice Cloning, and Highlight Clipping

The 度加 (Dujia) creation tool is an AIGC (AI-generated content) creation platform launched by Baidu that aims to lower the barrier to content generation and improve creative efficiency through AI technology. The platform aggregates several of Baidu's AIGC capabilities to provide a one-stop creation service from inspiration to finished product. Dujia's main ...
11 months ago
0 · 1.7K
Rask AI: Multilingual Video Translation with Professional Voice Cloning, a Video Localization Tool

Rask AI is an intelligent video localization platform designed to provide rapid audio and video production solutions for creators, educators, and global businesses. The platform supports automatic translation of video and audio into more than 130 languages to help users expand into global markets. Its distinctive features include video...
12 months ago
0 · 2.3K
有道数字人 (Youdao Digital Human): Virtual Avatar Broadcasting and Real-Time Interaction Platform | Free Creation of Cloned Digital Humans

Youdao Digital Human is a platform that integrates advanced AI technology, focusing on virtual avatar broadcasting and real-time interactive services. The platform uses self-developed speech recognition, speech synthesis, multimodal perception, and document Q&A technologies to create realistic digital human avatars for users, supporting video production, translation, teaching...
12 months ago
0 · 1.8K