General Introduction
Xiaozhi AI Chatbot is an open-source project based on the ESP32 development board that helps users build their own AI chat companion. The project was developed by Shrimp, mainly for teaching purposes: it aims to help more people get started with AI hardware development and understand how to apply large language models to real hardware devices. It supports speech recognition and conversation in multiple languages, including Mandarin, Cantonese, English, Japanese, and Korean. Through this project, users can learn how to develop with ESP-IDF and try out the various functions of an AI chatbot.
Function List
- Wi-Fi / ML307 Cat.1 4G: Supports Wi-Fi connectivity and 4G communication.
- Voice wake-up: Supports offline voice wake-up.
- Multilingual recognition: Supports speech recognition in five languages: Mandarin, Cantonese, English, Japanese, and Korean.
- Voiceprint recognition: Identifies who is calling the AI's name.
- Large-model TTS: Supports text-to-speech from Volcano Engine or CosyVoice.
- Large-model LLM: Supports Qwen 2.5 72B or the Doubao API as the large language model.
- Customized Roles: Configurable prompts and voice tones for creating custom roles.
- Short-term memory: Summarizes the conversation after each round of dialogue.
- Display: Supports OLED or LCD displays that show signal strength or conversation content.
- Hardware Support: Supports a wide range of hardware, such as the LiChuang ESP32-S3 development board, Espressif ESP32-S3-BOX3, M5Stack CoreS3, and more.
Usage Guide
Installation process
- Prepare the hardware: Make sure you have the necessary hardware, such as an ESP32 development board, a microphone module, a speaker module, and a display.
- Download the firmware: Visit the project's GitHub page and download the latest firmware version.
- Flash the firmware: Use a flash tool to write the firmware to the ESP32 development board. The specific steps are as follows:
  - Connect the ESP32 development board to the computer.
  - Open the flash tool and select the downloaded firmware file.
  - Click the "Burn" button and wait for flashing to complete.
- Configure the network: After flashing is complete, press the BOOT button on the development board to enter network configuration mode, then connect to a Wi-Fi or 4G network.
- Install dependencies: If you plan to build the project yourself, install the necessary software dependencies, such as the ESP-IDF development environment, according to the project documentation.
- Run the project: After completing the above steps, run the project and start using the AI chat feature.
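The flashing step above can also be done from the command line. The following is a minimal sketch, assuming the open-source `esptool.py` utility is installed and that the downloaded firmware is a single merged image written at offset `0x0`. The file name `xiaozhi.bin`, the serial port, the chip type, and the baud rate are illustrative assumptions, not values taken from the project documentation.

```python
import shlex
from pathlib import Path


def build_flash_command(port, firmware, chip="esp32s3", baud=460800, offset="0x0"):
    """Return the esptool argument list for writing one firmware image.

    All defaults are assumptions for illustration; check the project's
    release notes for the correct chip and flash offset.
    """
    firmware = Path(firmware)
    if firmware.suffix != ".bin":
        raise ValueError(f"expected a .bin image, got {firmware.name!r}")
    return [
        "esptool.py",
        "--chip", chip,
        "--port", port,
        "--baud", str(baud),
        "write_flash", offset, str(firmware),
    ]


if __name__ == "__main__":
    # Hypothetical port and file name; print the command instead of running it.
    cmd = build_flash_command("/dev/ttyUSB0", "xiaozhi.bin")
    print(shlex.join(cmd))
```

Printing the command rather than executing it lets you review the port and file name first; once they look right, the same list can be passed to `subprocess.run`.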
Instructions for use
- Voice wake-up: Speak the wake word into the microphone to wake up the AI chatbot.
- Voice dialog: After waking the device, you can talk with the AI directly in any of the supported languages.
- Customized Roles: Set custom role prompts and voice tones through configuration files.
- Display: View conversation content and signal strength on the OLED or LCD display.
- Voiceprint recognition: The AI can recognize who is calling its name and give a personalized response.
- Short-term memory: After each round of conversation, the AI summarizes the dialogue so that later replies can build on it.
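The wake-up and dialog flow described above can be pictured as a small state machine. The sketch below is only an illustration under assumed names: the states, events, and transitions are hypothetical and are not taken from the project's firmware.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()        # waiting for the wake word
    LISTENING = auto()   # awake, capturing the user's speech
    SPEAKING = auto()    # playing back the AI's reply


class Event(Enum):
    WAKE_WORD = auto()   # offline wake-word detector fired
    SPEECH_END = auto()  # the user finished speaking
    REPLY_DONE = auto()  # reply playback finished
    TIMEOUT = auto()     # nothing was said for a while


# Assumed transition table: (current state, event) -> next state.
TRANSITIONS = {
    (State.IDLE, Event.WAKE_WORD): State.LISTENING,
    (State.LISTENING, Event.SPEECH_END): State.SPEAKING,
    (State.LISTENING, Event.TIMEOUT): State.IDLE,
    (State.SPEAKING, Event.REPLY_DONE): State.LISTENING,
}


def step(state, event):
    """Return the next state; events that do not apply leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.IDLE):
    """Feed a sequence of events through the state machine and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

For example, `run([Event.WAKE_WORD, Event.SPEECH_END, Event.REPLY_DONE])` ends in `State.LISTENING`, which models a device that keeps listening for a follow-up question after it has answered.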
Detailed Operation Procedure
- Wake-up and dialog:
  - Make sure the device is connected to the network.
  - Speak the wake word into the microphone, such as "Xiaozhi", and the device will wake up and start listening.
  - Speak your question or command; the AI will recognize your speech and respond.
- Customized role settings:
  - Open the configuration file and find the role settings section.
  - Enter your custom prompt and voice tone parameters, then save the file.
  - Reboot the device for the new role settings to take effect.
- Using the display:
  - When the device starts up, the display shows the current network signal strength.
  - During a conversation, the display shows the conversation content for easy viewing.
- Voiceprint recognition:
  - Set the voiceprint recognition parameters in the configuration file.
  - When the device is woken up, it automatically recognizes the speaker's voiceprint and gives a personalized response.
- Short-term memory:
  - After each round of conversation, the AI automatically summarizes the dialogue so that later replies can build on it.
  - The summary parameters can be adjusted in the configuration file to tune how much is remembered.
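The role settings and short-term memory steps above both rely on a configuration file. The sketch below shows one way such settings might be parsed and used, assuming a JSON file. The keys `prompt`, `voice`, and `summary_max_chars` are hypothetical and do not reflect the project's actual configuration format, and the `summarize` callback stands in for whatever the large language model would do on a real device.

```python
import json
from dataclasses import dataclass


@dataclass
class RoleConfig:
    prompt: str                   # role prompt sent to the language model
    voice: str                    # voice tone used for text-to-speech
    summary_max_chars: int = 200  # upper bound on the stored summary length


def load_role_config(text):
    """Parse a JSON role configuration (hypothetical schema) and check required fields."""
    data = json.loads(text)
    missing = [key for key in ("prompt", "voice") if not data.get(key)]
    if missing:
        raise ValueError(f"missing role settings: {', '.join(missing)}")
    return RoleConfig(
        prompt=data["prompt"],
        voice=data["voice"],
        summary_max_chars=int(data.get("summary_max_chars", 200)),
    )


def update_memory(memory, user_text, reply_text, summarize, max_chars):
    """Fold one round of dialogue into the running summary.

    `summarize` is any callable that maps text to shorter text; the result
    is cut to `max_chars` so the memory cannot grow without bound.
    """
    round_text = f"User: {user_text}\nAI: {reply_text}"
    combined = f"{memory}\n{round_text}" if memory else round_text
    return summarize(combined)[:max_chars]
```

A caller would load the configuration once at startup, then call `update_memory` after every round with `config.summary_max_chars` as the limit. Keeping the summarizer as a parameter makes it easy to swap a simple placeholder for a real model call.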