Summary: Google researchers have just developed VLOGGER, a new artificial intelligence model that generates realistic talking head videos with full upper body motion from only still images and audio clips.
The details:
VLOGGER creates a controllable avatar that captures a person's likeness and movements.
The model was trained on a large multimedia dataset of 800,000 videos of people talking, with labels for each part of the face and body.
Potential applications include dubbing videos in other languages, creating realistic avatars for games or assistants, and supporting low-bandwidth video chat.
Why it matters: Whether it's bringing realism to AI assistants, enabling real-time video dubbing across languages, or letting us video chat as our favorite avatars, models like VLOGGER offer a fascinating preview of a future where the lines between our physical and digital selves blur.