General Introduction
Open-LLM-VTuber is an open-source project that lets users interact with Large Language Models (LLMs) by voice and text, and uses Live2D technology to present an animated virtual character. It supports Windows, macOS, and Linux, can run completely offline, and offers both a web interface and a desktop client. Users can treat it as a virtual girlfriend, pet, or desktop assistant, building a personalized AI companion by customizing its appearance, personality, and voice. The project started as an open-source recreation of the closed-source AI VTuber "neuro-sama" and has grown into a feature-rich platform supporting multiple language models, speech recognition, text-to-speech, and visual perception. The codebase was refactored in v1.0.0 and remains under active development, with more features planned.
Feature List
- Voice interaction: hands-free voice conversation is supported, and the AI can be interrupted at any time for smooth communication.
- Live2D animation: built-in animated avatars that show expressions and motions based on the content of the conversation.
- Cross-platform support: compatible with Windows, macOS, and Linux; runs on NVIDIA GPUs, non-NVIDIA GPUs, or CPU only.
- Offline operation: every feature can run without a network connection, keeping data private.
- Desktop pet mode: transparent background, always-on-top, and mouse pass-through are supported, and the character can be dragged anywhere on the screen.
- Visual perception: the AI can interact via the camera or recognize screen content during video interaction.
- Multi-model support: compatible with a wide range of LLM backends such as Ollama, OpenAI, Claude, and Mistral, plus speech modules such as sherpa-onnx and Whisper.
- Character customization: Live2D models can be imported, and the personality and voice adjusted.
- Touch feedback: click or drag the character to trigger an interactive response.
- Chat history: switch between past conversations and keep interaction records.
Usage Guide
Installation Process
Open-LLM-VTuber must be deployed locally. The detailed steps are as follows:
1. Prerequisites
- System: a Windows, macOS, or Linux computer; an NVIDIA GPU is recommended but not required.
- Software: Git, Python 3.10+, and uv (the recommended package manager).
- Network: the initial deployment needs Internet access to download dependencies; users in mainland China are advised to use a proxy to speed up downloads.
2. Download the code
- Clone the project from the terminal:
git clone https://github.com/Open-LLM-VTuber/Open-LLM-VTuber --recursive
cd Open-LLM-VTuber
- Or download the latest ZIP file from GitHub Release and unzip it.
- Note: if the repository was cloned without --recursive, run git submodule update --init to fetch the front-end submodule.
3. Install dependencies
- Install uv:
- Windows (PowerShell):
irm https://astral.sh/uv/install.ps1 | iex
- macOS/Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh
- Run in the project directory:
uv sync
This automatically installs FastAPI, onnxruntime, and the other dependencies.
4. Configure the environment
- The first run generates the configuration file:
uv run run_server.py
- Edit the generated conf.yaml and configure the following:
- LLM: choose a model provider (e.g. Ollama with llama3, or the OpenAI API, which requires an API key).
- ASR: the speech recognition module (e.g. sherpa-onnx).
- TTS: the text-to-speech module (e.g. Edge TTS).
- Example:
llm:
  provider: ollama
  model: llama3
asr:
  provider: sherpa-onnx
tts:
  provider: edge-tts
(Key names here are illustrative; check the generated conf.yaml for the exact structure.)
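If the hosted OpenAI API is used instead of Ollama, the LLM block also needs the API key. The snippet below is only a hedged sketch; the api_key and base_url key names and the model value are assumptions for illustration, not the project's confirmed schema.
llm:
  provider: openai
  model: gpt-4o-mini                      # example model name; use any model your account can access
  api_key: sk-your-key-here               # hypothetical key name; paste the real API key
  base_url: https://api.openai.com/v1     # hypothetical key name; change for compatible endpoints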
5. Start the service
- Run:
uv run run_server.py
- Open http://localhost:8000 in a browser to use the web version, or run the desktop client instead.
6. Desktop client (optional)
- Download open-llm-vtuber-electron from GitHub Release (.exe for Windows, .dmg for macOS).
- Launch the client with the back-end service already running to experience desktop pet mode.
7. Update and uninstall
- Update: from v1.0.0 onward, run uv run update.py; earlier versions must be redeployed by following the latest documentation.
- Uninstall: delete the project folder, check MODELSCOPE_CACHE or HF_HOME for downloaded model files, and uninstall tools such as uv.
Feature Walkthrough
Voice Interaction
- Enable voice: click the "Microphone" icon on the web page or in the client.
- Converse: speak directly and the AI responds in real time; press the "Interrupt" button to cut the AI off.
- Optimize: adjust the ASR and TTS modules in conf.yaml to improve recognition accuracy and speech quality, as in the sketch below.
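A minimal sketch of such an adjustment, assuming the same provider-style keys as the step 4 example; the exact option names in the generated conf.yaml may differ:
asr:
  provider: whisper       # e.g. switch from sherpa-onnx to Whisper for recognition
tts:
  provider: edge-tts      # keep Edge TTS, or swap in another supported TTS module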
Character Customization
- Import a model: place the .moc3 model file into the frontend/live2d_models directory.
- Adjust the personality: edit the prompt in conf.yaml, for example to a "gentle big sister" persona, as in the sketch below.
- Customize the voice: record samples with tools such as GPT-SoVITS to generate a unique voice.
Desktop Pet Mode
- Enable the mode: select "Desktop Pet" in the client and turn on "Transparent Background" and "Always on Top".
- Move the character: drag it to any position on the screen.
- Interact: click the character to trigger touch feedback and see an inner monologue or a change of expression.
Visual Perception
- Enable the camera: click "Video Chat" and grant camera access.
- Screen recognition: select "Screen Sense" to let the AI analyze the screen content.
- Example: ask "What's on the screen?" and the AI will describe the image.
Notes
- Browser: Chrome is recommended; other browsers may not display the Live2D model correctly.
- Performance: GPU acceleration requires properly configured drivers; running on the CPU alone may be noticeably slower.
- License: the built-in Live2D sample models are covered by a separate license; commercial use requires contacting Live2D Inc.