AI Personal Learning
and practical guidance
Total 69 articles

Tags: ai text to speech

Paper to Podcast: Converting Academic Papers to Multi-Person Conversation Podcasts

General Introduction: Paper to Podcast is an open-source tool that turns academic research papers into lively, entertaining podcasts. It uses AI to convert a PDF paper into a conversation among three characters (the host, the learner, and the expert), making complex academic content easier to understand. This ...

MegaTTS3: A Lightweight Model for Synthesizing Chinese and English Speech

Comprehensive Introduction: MegaTTS3 is an open-source speech synthesis tool developed by ByteDance in cooperation with Zhejiang University, focused on generating high-quality Chinese and English speech. Its core model has only 0.45B parameters, making it lightweight and efficient, and it supports mixed Chinese-English speech generation and voice cloning. The project is hosted on GitHub, ti...

Cat and Star: a story-listening app that writes exclusive fairy tales with your child

Comprehensive Introduction: "Cat & Star" (maoyuxing.com) is an interactive story creation platform designed for children that helps parents and children create personalized fairy tales together through a mobile app. Users enter the child's name, preferences, and other information to generate unique story content, allowing the child to become the story...

Llasa 1~8B: an open source text-to-speech model for high quality speech generation and cloning

General Introduction: Llasa-3B is an open-source text-to-speech (TTS) model developed by the Audio Lab of the Hong Kong University of Science and Technology (HKUST Audio). The model is based on Llama 3.2 3B and has been fine-tuned to provide high-quality speech generation that not only supports multiple languages, but also enables emotional expression and personality...
