General Introduction
Always-On AI Assistant is a project that builds a powerful, permanently online AI assistant by integrating DeepSeek-V3, RealtimeSTT, and Typer. It is optimized for engineering development scenarios and provides a complete voice-interaction interface and command-execution framework. The system uses a modular design with two entry points: a basic assistant chat interface and an advanced Typer assistant session command system, both of which support real-time speech recognition and text-to-speech. By combining ElevenLabs speech synthesis with RealtimeSTT real-time speech recognition, the project gives developers a complete paradigm for building intelligent voice assistants, making them easier and faster to create.
Function List
- Real-time speech recognition and response system
- Intelligent dialog engine based on Deepseek-V3
- Customizable Typer command execution framework
- Multi-mode operation support (default, execute, execute-no-scratch)
- Dynamic memory management system (Scratchpad)
- Highly configurable assistant architecture
- Native speech recognition support
- ElevenLabs high-quality speech synthesis integration
- Extensible command template system
- Real-time interactive session capabilities
Usage Guide
1. Environment configuration
1.1 Basic configuration
- Clone the project locally
- Copy the environment configuration file by running
cp .env.sample .env
- Update the API keys:
- Set DEEPSEEK_API_KEY (for AI model access)
- Set ELEVEN_API_KEY (for speech synthesis)
- Run
uv sync
to install the project dependencies
- Optional: install Python 3.11 with
uv python install 3.11
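To confirm that both keys are actually picked up before starting the assistant, a quick check such as the sketch below can help. It assumes the python-dotenv package for loading the .env file; the project itself may load environment variables differently.

import os
from dotenv import load_dotenv  # assumption: python-dotenv; the project may load .env another way

load_dotenv()  # reads the .env file from the current directory

# Key names are the ones listed in the configuration steps above.
for key in ("DEEPSEEK_API_KEY", "ELEVEN_API_KEY"):
    print(f"{key}: {'set' if os.getenv(key) else 'MISSING'}")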
1.2 System requirements
- Python 3.11 or later
- Stable network connection
- Microphone (for voice input)
- Audio output device (for spoken responses)
2. Using the main functions
2.1 Basic Assistant Chat Interface
- Start command:
uv run python main_base_assistant.py chat
- This opens a basic dialog interface
- You can interact directly by text or by voice
- Voice responses use the local (native) TTS engine; an illustrative sketch follows this list
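This overview does not say which local TTS engine the basic assistant uses. Purely as an illustration, pyttsx3 is one common way to produce offline speech output in Python; it is not necessarily what this project relies on.

import pyttsx3  # offline TTS library, used here only as an illustration

def speak(text: str) -> None:
    """Speak a response with the system's local TTS voices."""
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()

if __name__ == "__main__":
    speak("Assistant is ready.")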
2.2 Typer Assistant Session Command System
- Start command:
uv run python main_typer_assistant.py awaken --typer-file commands/template.py --scratchpad scratchpad.md --mode execute
- Parameter description (an illustrative Typer template sketch follows this list):
- --typer-file: specifies the command template file location
- --scratchpad: specifies the assistant's dynamic memory (scratchpad) file
- --mode: sets the run mode (default / execute / execute-no-scratch)
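The actual commands defined in commands/template.py are specific to this project and are not reproduced here. The sketch below only illustrates the Typer pattern such a template follows; the ping_server command, its options, and its behavior are assumptions made for this example.

import subprocess

import typer

app = typer.Typer()

@app.command()
def ping_server(host: str = "127.0.0.1", count: int = 1) -> None:
    """Hypothetical example command: ping a host and report whether it answers."""
    # "-c" is the count flag on Linux/macOS; Windows uses "-n" instead.
    result = subprocess.run(["ping", "-c", str(count), host], capture_output=True, text=True)
    typer.echo(f"{host} is {'reachable' if result.returncode == 0 else 'unreachable'}")

if __name__ == "__main__":
    app()

Commands added in this style are what the assistant can invoke when it maps a recognized utterance onto the Typer CLI, as described in section 2.3.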
2.3 Interacting with the assistant
- Say the wake word "Ada" clearly
- Then speak a command, e.g. "Ada, ping the server and wait for a response"
- The assistant recognizes the speech in real time and executes the matching command
- Execution results are recorded in the scratchpad.md file (a minimal wake-word listening sketch follows this list)
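How the project itself detects the wake word is not described in this overview. The sketch below is only a minimal illustration using RealtimeSTT's AudioToTextRecorder together with a plain text check for "Ada"; handle_command is a placeholder for whatever the assistant does next.

from RealtimeSTT import AudioToTextRecorder

WAKE_WORD = "ada"

def handle_command(text: str) -> None:
    # Placeholder: the real assistant would forward the utterance to its AI engine.
    print(f"Command received: {text}")

def on_utterance(text: str) -> None:
    # React only to utterances that contain the wake word.
    if WAKE_WORD in text.lower():
        handle_command(text)

if __name__ == "__main__":
    recorder = AudioToTextRecorder()
    print("Listening... say 'Ada' followed by a command.")
    while True:
        recorder.text(on_utterance)  # blocks until one full utterance is transcribed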
3. Architectural components
3.1 Typer Assistant Architecture
- Brain: DeepSeek-V3 as the core AI engine (a minimal call sketch follows this list)
- Task handling: defined via prompts/typer-commands.xml
- Dynamic memory: state management using scratchpad.txt
- Speech recognition: real-time speech-to-text via RealtimeSTT
- Speech synthesis: ElevenLabs integration provides natural speech output
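DeepSeek's hosted API is OpenAI-compatible, so a minimal "brain" call can look like the sketch below. The system prompt and the way the real project assembles its prompts from prompts/typer-commands.xml are not shown; those parts are placeholders.

import os

from openai import OpenAI  # DeepSeek exposes an OpenAI-compatible endpoint

client = OpenAI(
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

def ask_brain(user_text: str) -> str:
    """Send one user utterance to DeepSeek-V3 and return the reply text."""
    response = client.chat.completions.create(
        model="deepseek-chat",  # DeepSeek-V3 chat model
        messages=[
            # The system prompt here is a placeholder, not the project's real prompt.
            {"role": "system", "content": "You are Ada, an engineering assistant."},
            {"role": "user", "content": user_text},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(ask_brain("Ping the server and summarize the result."))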
3.2 Basic Assistant Architecture
- Core engine: the local ollama:phi4 model (a minimal call sketch follows this list)
- Simplified design: no additional prompts or dynamic memory required
- Speech recognition: also uses RealtimeSTT
- Voice output: uses the local TTS system
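Assuming the ollama Python client and a phi4 model already pulled with Ollama, a minimal chat call looks like the sketch below; whether the project uses this client or Ollama's HTTP API directly is not specified here.

import ollama  # requires a running Ollama server and `ollama pull phi4`

def ask_local_model(user_text: str) -> str:
    """Send one utterance to the local phi4 model and return the reply text."""
    response = ollama.chat(
        model="phi4",
        messages=[{"role": "user", "content": user_text}],
    )
    return response["message"]["content"]

if __name__ == "__main__":
    print(ask_local_model("Say hello in one short sentence."))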
4. Advanced feature configuration
- Assistant configuration can be customized through the assistant_config.yml file (a hypothetical loading sketch follows this list)
- Supports adding custom Typer commands
- Speech recognition and synthesis parameters can be adjusted
- Supports extension with new functional modules
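The schema of assistant_config.yml is not documented in this overview. The sketch below only shows one way to load such a file with PyYAML; every key name used here (assistant_name, voice, model) is a hypothetical example, not taken from the repository.

import yaml  # PyYAML

def load_assistant_config(path: str = "assistant_config.yml") -> dict:
    """Load the assistant configuration from YAML; callers fall back to defaults for missing keys."""
    with open(path, "r", encoding="utf-8") as f:
        return yaml.safe_load(f) or {}

if __name__ == "__main__":
    config = load_assistant_config()
    # Hypothetical keys; the real file may use different names.
    print(config.get("assistant_name", "Ada"))
    print(config.get("voice", "default"))
    print(config.get("model", "deepseek-chat"))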