Weebo: a real-time voice chatbot that provides a natural language conversational experience

Posted 7 months ago by AI Sharing Circle in Latest AI Resources

General Introduction

Weebo is an open-source, real-time voice chatbot that uses Whisper Small for speech recognition, Llama 3.2 for natural language generation, and Kokoro-82M for speech synthesis. Developed by Amanvir Parhar, the project aims to provide an efficient voice dialog solution that runs entirely on the local device. Weebo supports multiple voices and generates responses in real time, making it suitable for a wide range of applications that require voice interaction.

Feature List

Real-time speech recognition: Converts speech to text efficiently using the Whisper Small model.
Natural language generation: Generates natural language responses with the Llama 3.2 model.
Speech synthesis: Converts text to speech using the Kokoro-82M model.
Multi-voice support: Provides multiple voice options to enhance the user experience.
Runs locally: No cloud services are required; all processing is done on the local device.
Open source: The code is publicly available, so users can freely modify it and extend its functionality.

Usage Guide

Installation process

Download the required models:
- Download the Kokoro-82M model file kokoro-v0_19.onnx and place it in the project folder.
- Use the Ollama tool to pull the Llama 3.2 model.
Clone the Weebo project code:

   git clone https://github.com/amanvirparhar/weebo.git
   cd weebo

Install the dependencies:

   pip install -r requirements.txt

Run the chatbot:

   python main.py
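
Before the first run, it can help to confirm that the prerequisites from the installation steps are in place. The following is a minimal sketch, not part of the Weebo code base: the function name preflight is hypothetical, and it only checks that kokoro-v0_19.onnx exists in the project folder and that an ollama executable is on the PATH. It does not verify that the Llama 3.2 model has actually been pulled.

```python
from pathlib import Path
import shutil

MODEL_FILE = "kokoro-v0_19.onnx"  # file name taken from the installation steps


def preflight(project_dir="."):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper for illustration; Weebo itself may not perform
    any such check.
    """
    problems = []
    project = Path(project_dir)
    if not (project / MODEL_FILE).is_file():
        problems.append(f"missing {MODEL_FILE} in {project.resolve()}")
    if shutil.which("ollama") is None:
        problems.append("ollama executable not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("warning:", problem)
```

Run from inside the cloned weebo folder, the script prints one warning per missing prerequisite and nothing if both checks pass.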

Instructions for use

After starting the program, Weebo will start listening for voice input.
Users can speak naturally and Weebo will generate a voice response after a short pause.
Press Ctrl+C to stop the program.
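
Responding "after a short pause" implies some form of end-of-utterance detection. The source does not say how Weebo detects the pause, so the following is only one common approach, sketched under assumptions: audio arrives as frames of float samples in the range -1.0 to 1.0, and the threshold and silent_frames_needed values are illustrative defaults rather than Weebo's settings.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of float samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def utterance_end(frames, threshold=0.01, silent_frames_needed=3):
    """Return the index of the frame at which the speaker is considered
    finished, or None if no complete utterance has been seen.

    An utterance ends once speech has been heard and is followed by
    `silent_frames_needed` consecutive frames quieter than `threshold`.
    Both parameters are assumed values for illustration.
    """
    heard_speech = False
    silent_run = 0
    for i, frame in enumerate(frames):
        if rms(frame) >= threshold:
            heard_speech = True
            silent_run = 0  # speech resumed, so restart the silence count
        elif heard_speech:
            silent_run += 1
            if silent_run >= silent_frames_needed:
                return i
    return None
```

Leading silence is ignored because the silence counter only starts once speech has been heard, so the detector does not fire before the user begins talking.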

Main feature workflow

Speech recognition: Weebo uses the Whisper Small model to convert the user's speech into text.
Natural language generation: Using the Llama 3.2 model, Weebo interprets the transcribed input and generates a natural language response.
Speech synthesis: Using the Kokoro-82M model, Weebo converts the generated text response into speech and plays it back through the speaker.
Multi-voice support: Users can select a different voice in the configuration file to suit different application requirements.
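
The three stages above form a simple chain: audio to text, text to reply, reply to audio. The sketch below shows that chain with each stage passed in as a plain function. The names run_turn, fake_transcribe, fake_generate, and fake_synthesize are hypothetical, and the stand-in stages only echo text so the example runs without any models installed. In a real setup they would wrap Whisper Small, Llama 3.2, and Kokoro-82M respectively.

```python
from typing import Callable


def run_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    generate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run one conversational turn: recorded audio in, spoken reply out."""
    text = transcribe(audio).strip()
    if not text:
        return b""  # nothing intelligible was said, so produce no reply
    reply = generate(text)
    return synthesize(reply)


# Stand-in stages (assumptions for illustration, not real model calls).
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_generate(prompt: str) -> str:
    return f"You said: {prompt}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    out = run_turn(b"hello", fake_transcribe, fake_generate, fake_synthesize)
    print(out)  # b'You said: hello'
```

Passing the stages as arguments keeps the turn logic independent of any particular model library, which also makes it easy to test.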

Detailed steps

Launch Weebo: Run python main.py. The program will start listening for the user's voice input.
Voice input: Speak directly into the microphone, and Weebo will automatically recognize and process the speech.
Generating a response: After recognizing the speech, Weebo generates a natural language response using the Llama 3.2 model and converts it to speech using the Kokoro-82M model.
Playing the response: The generated voice response is played through the speaker so the user can hear Weebo's answer.
Stopping the program: Press Ctrl+C at any time to stop Weebo.
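
The steps above amount to a loop that repeats until the user presses Ctrl+C. The following is a minimal sketch of such a loop, not Weebo's actual main.py: conversation_loop and its listen, respond, and speak parameters are hypothetical names. It relies on the standard Python behavior that Ctrl+C raises KeyboardInterrupt in the main thread.

```python
def conversation_loop(listen, respond, speak):
    """Repeat listen -> respond -> speak until interrupted.

    Returns the number of completed turns. All three arguments are
    caller-supplied functions (hypothetical, for illustration).
    """
    turns = 0
    try:
        while True:
            utterance = listen()        # blocks until the user has spoken
            speak(respond(utterance))   # generate a reply and play it back
            turns += 1
    except KeyboardInterrupt:
        # Ctrl+C lands here, so the program exits without a traceback.
        print("\nStopping.")
    return turns
```

Catching the interrupt at the top of the loop gives one place to release the microphone or audio device before exiting, should that be needed.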

With the above steps, users can easily start using Weebo to have real-time voice conversations and experience natural and smooth voice interaction.