SoulX-Podcast - Soul AI Lab's Open-Source Conversational Speech Synthesis Model
What is SoulX-Podcast
SoulX-Podcast is a multi-speaker conversational speech synthesis model open-sourced by Soul AI Lab and designed for generating high-quality podcast audio. It generates multi-turn dialogue that simulates the flow of real podcast conversations. It supports Mandarin, English, and several Chinese dialects, such as Sichuanese, Henanese, and Cantonese. It also supports cross-dialect zero-shot voice cloning, which generates speech in different dialects from a single audio prompt. The model includes paralinguistic control: it can produce non-verbal elements such as laughter and sighs to make speech sound more natural. In long-form conversations, SoulX-Podcast maintains stable timbre and natural rhythmic variation, generating coherent dialogue of up to 90 minutes.

Features of SoulX-Podcast
- Multi-speaker dialogue generation: Generates multi-speaker conversations of up to 90 minutes with stable timbre and natural rhythmic variation, suitable for multi-turn scenarios such as podcasts.
- Multi-language and dialect support: Supports Mandarin, English, and several Chinese dialects (e.g., Sichuanese, Henanese, Cantonese), with cross-dialect voice cloning.
- Paralinguistic control: Generates paralinguistic elements such as laughter, sighs, and breathing sounds to make synthesized speech more natural and realistic.
- Long-form conversational coherence: Maintains coherence and emotional continuity across long conversations through contextual regularization mechanisms.
- Zero-shot text-to-speech synthesis: Generates high-quality personalized speech from a short reference audio prompt, without training on the target speaker's voice.
- High-performance speech synthesis: Also performs well on conventional single-speaker speech synthesis tasks, reaching industry-leading levels.
- Open source and ease of use: Provides open-source code and detailed installation guides so developers can use and extend the model.
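To make the multi-speaker and paralinguistic features above more concrete, here is a minimal sketch of how a dialogue script could be represented before it is handed to a synthesis model. It is not SoulX-Podcast's actual input format: the `Turn` class, the `[S1]`-style speaker labels, the `<laughter>`-style markers, and the tag names are all assumptions made for illustration. Consult the GitHub repository for the real syntax.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical tag names, chosen for illustration only.
PARALINGUISTIC_TAGS = {"laughter", "sigh", "breathing"}

# Matches markers such as <laughter>; this marker syntax is an assumption.
TAG_PATTERN = re.compile(r"<(\w+)>")


@dataclass
class Turn:
    speaker: str  # hypothetical speaker label, e.g. "S1"
    text: str     # utterance text, optionally containing <tag> markers


def validate_turn(turn: Turn) -> list[str]:
    """Return the paralinguistic tags used in a turn; reject unknown tags."""
    tags = TAG_PATTERN.findall(turn.text)
    unknown = [t for t in tags if t not in PARALINGUISTIC_TAGS]
    if unknown:
        raise ValueError(f"unknown paralinguistic tags: {unknown}")
    return tags


def render_script(turns: list[Turn]) -> str:
    """Flatten the dialogue into one "[speaker] text" line per turn."""
    return "\n".join(f"[{t.speaker}] {t.text}" for t in turns)


script = [
    Turn("S1", "Welcome back to the show! <laughter>"),
    Turn("S2", "Thanks for having me. <sigh> It has been a long week."),
]

for turn in script:
    validate_turn(turn)
print(render_script(script))
```

Keeping speaker labels and non-verbal markers inline with the text is one simple way to express who speaks and where a laugh or sigh should occur; a real pipeline would map each speaker label to its own reference audio prompt.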
Core Benefits of SoulX-Podcast
- Multi-speaker dialogue generation: Generates natural, fluent multi-turn conversations, suitable for multi-speaker scenarios such as podcasts.
- Multi-language and dialect support: Supports Mandarin, English, and several Chinese dialects, with cross-dialect voice cloning.
- Paralinguistic control: Supports generating paralinguistic elements such as laughter and sighs to make speech more natural.
- Long-form conversational coherence: Generates coherent dialogue of up to 90 minutes while keeping timbre stable and rhythm naturally varied.
- Zero-shot text-to-speech synthesis: Generates personalized speech from a short reference audio prompt, without training on the target speaker's voice.
- High performance and quality: Performs well on conventional single-speaker speech synthesis tasks, reaching industry-leading levels.
What is SoulX-Podcast's official website?
- Project website: https://soul-ailab.github.io/soulx-podcast/
- GitHub repository: https://github.com/Soul-AILab/SoulX-Podcast
- Hugging Face model collection: https://huggingface.co/collections/Soul-AILab/soulx-podcast
- arXiv technical paper: https://arxiv.org/pdf/2510.23541
Who is SoulX-Podcast for?
- Podcast creators: Generate high-quality multi-speaker dialogue for podcast production.
- Content creators: Generate audio content such as audio stories and virtual interviews.
- Virtual assistant developers: Use the multi-language and dialect support to give virtual assistants natural, fluent voice interaction.
- Language researchers: Use the multi-language and dialect support for linguistic research and dialect preservation projects.
- Educators: Produce educational audio content that supports multilingual teaching and language learning.
- Entertainment industry practitioners: Generate voices for virtual characters in games, animation, and related fields.
© Copyright notice
This article is copyrighted by AI Sharing Circle. Please do not reproduce it without permission.