General Introduction
Linly-Talker is an innovative digital human dialogue system that combines Large Language Models (LLMs) with visual generation models to create a novel approach to human-computer interaction. The system integrates technologies such as Whisper, Linly, Microsoft Speech Services, and SadTalker to provide a realistic digital human conversation experience. Linly-Talker lets users upload images for conversations and enhances interactivity and realism through a multi-round dialogue system. The project was developed by Kedreamix and is open-sourced on GitHub for developers and researchers to use and improve.
Function List
- Multi-Round Dialogue System: Supports contextualized multi-round conversations for enhanced interactivity and realism.
- Image Upload Dialogue: Users can upload images and converse with the digital human.
- Speech Synthesis and Recognition: Integrates Microsoft TTS and FunASR to provide multiple voice types and fast speech recognition.
- Video Subtitle Generation: Supports generating subtitles for videos to enhance the visual experience.
- Voice Cloning: With the GPT-SoVITS model, a voice can be cloned from as little as one minute of speech data.
- Personalized Character Generation: Supports personalized character generation with multiple models and options.
- Real-Time Chat: Integrates MuseTalk to provide basic real-time conversation functionality.
How to Use
Installation Process
- Clone the project: Run the following command in a terminal to clone the repository:
git clone https://github.com/Kedreamix/Linly-Talker.git
- Install dependencies: Enter the project directory and install the required packages:
cd Linly-Talker
pip install -r requirements_app.txt
pip install -r requirements_webui.txt
- Configure the environment: Set environment variables and certificates as needed (for example, credentials for the speech services) to ensure the system runs properly.
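The exact variables depend on which services you enable; the project's own documentation is the authority here. As a hedged illustration only (the variable names below are hypothetical, not taken from Linly-Talker), speech-service credentials might be read from the environment like this:

```python
import os

# Hypothetical variable names for illustration only -- check the project's
# documentation for the actual configuration keys it expects.
def load_speech_config():
    """Read speech-service credentials from the environment, with defaults."""
    return {
        "speech_key": os.environ.get("SPEECH_SERVICE_KEY", ""),
        "speech_region": os.environ.get("SPEECH_SERVICE_REGION", "eastus"),
    }

config = load_speech_config()
if not config["speech_key"]:
    print("Warning: SPEECH_SERVICE_KEY is not set; speech synthesis may fail.")
```

Keeping secrets in environment variables rather than in the source tree is the usual practice, since the repository is public.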
Usage Guidelines
- Start the WebUI: Run the following command to launch the WebUI:
python webui.py
Then open http://localhost:7860 in your browser to access the Linly-Talker web interface.
- Upload an image for conversation:
- In the WebUI, click the "Upload Image" button and select an image file.
- Once the image is uploaded, the system automatically sets up a dialogue so the user can interact with the digital human.
- Speech synthesis and recognition:
- Enter text in the dialog box, select a voice type, and click the "Generate Voice" button; the system will synthesize the speech and play it back.
- Users can also speak into the microphone, and the system will automatically recognize the speech and transcribe it to text.
- Video subtitle generation:
- Upload a video file; the system will automatically generate subtitles and embed them in the video. Users can then download the subtitled video.
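Generated subtitles are commonly delivered in the SubRip (SRT) format. The following sketch is not the project's actual code, just an illustration of that format: it renders recognized speech segments, given as (start, end, text) tuples with times in seconds, as numbered SRT entries.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments):
    """Render (start, end, text) tuples as an SRT document."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, start=1):
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)

print(to_srt([(0.0, 2.5, "Hello, I am a digital human."),
              (2.5, 5.0, "Nice to meet you.")]))
```

Each entry consists of a sequence number, a `start --> end` timestamp line, and the subtitle text, separated by blank lines.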
- Voice cloning:
- Upload a voice sample of the target speaker; the system will use the GPT-SoVITS model to clone the voice and generate speech that resembles the target speaker.
- Personalized character generation:
- In the WebUI, select the "Personalized Character Generation" option and enter the character information; the system will generate a personalized digital persona.
- Real-time chat:
- Select the MuseTalk module to enable real-time conversation, allowing the user to interact with the digital human in real time.