
Google's Veo 2 Video Generation Comes to Gemini and Whisk, Expanding AI Authoring Tool Territory

Google recently announced that its video generation model, Veo 2, has been integrated into both the Gemini Advanced service and its experimental platform Whisk. The move means Google One AI Premium subscribers can now generate short videos directly from text prompts or existing images.

Google positions Veo 2 as its advanced video generation technology. It converts text descriptions into MP4 videos of up to 8 seconds at 720p resolution in a 16:9 aspect ratio. Google says the model has an improved understanding of real-world physics and human movement, and can generate clips with smooth motion, realistic scenes and rich detail across a wide range of subjects and styles.
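To make those output limits concrete, the short Python sketch below records them and derives the frame width implied by 720p at 16:9. The `ClipSpec` class and its field names are hypothetical, invented here for illustration; they are not part of any Google API, and the values are only the ones quoted above.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class ClipSpec:
    """Output limits as quoted in the article (hypothetical helper, not an official schema)."""
    max_seconds: int = 8
    height_px: int = 720                      # "720p" refers to the frame height
    aspect_ratio: Fraction = Fraction(16, 9)  # width : height
    container: str = "mp4"

    @property
    def width_px(self) -> int:
        # Width follows from height and aspect ratio: 720 * 16 / 9 = 1280.
        return int(self.height_px * self.aspect_ratio)

    def fits(self, seconds: float, width: int, height: int) -> bool:
        """Return True if a requested clip stays within these limits."""
        return (
            0 < seconds <= self.max_seconds
            and height == self.height_px
            and width == self.width_px
        )


spec = ClipSpec()
print(spec.width_px, "x", spec.height_px)   # 1280 x 720
print(spec.fits(8, 1280, 720))              # True
print(spec.fits(12, 1920, 1080))            # False
```

In other words, a clip at these settings is 1280 x 720 pixels, and anything longer than 8 seconds or at a higher resolution falls outside what the article describes.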



Text-to-video generation in Gemini

In Gemini Advanced, users create a video by selecting the Veo 2 model from the model drop-down menu. The process is straightforward: the user enters a detailed description of the scene, and Gemini attempts to generate a matching video. Google's official demos show several styles of generation, for example:

  • Scene one: A wide, slow-moving camera shot sweeps across a massive glacial cavern, where two figures in white exoskeleton suits walk through the ice, their helmet lights illuminating frozen, candy-like objects in the ice walls.
    • Link to sample video: https://storage.googleapis.com/gweb-uniblog-publish-prod/original_videos/Gemini_Generated_Video__37_aDEwjss.mp4
  • Scene two: Animated style, a mouse with oversized glasses reads a book by the light of a glowing mushroom in a cozy forest lair.
    • Link to sample video: https://storage.googleapis.com/gweb-uniblog-publish-prod/original_videos/Mouse_Reads_Video_Generated.mp4
  • Scene three: Aerial view of grass-covered cliffs leading down to a sandy beach, with waves lapping at the shore and a prominent sea stack standing offshore, bathed in the golden glow of sunrise or sunset.
    • Link to sample video: https://storage.googleapis.com/gweb-uniblog-publish-prod/original_videos/Gemini_Generated_Video__13.mp4
  • Scene four: Somatotropic style time-lapse of a pink, gray and white ice cream melting under a clear blue sky.
    • Link to sample video: https://storage.googleapis.com/gweb-uniblog-publish-prod/original_videos/KR_Veo2_4.mp4

Google emphasizes that the more detailed the description, the more control the user has over the final video. The feature opens up new possibilities for quickly visualizing concepts, telling short visual stories or trying out creative combinations. Generated videos can be uploaded to platforms such as TikTok or YouTube Shorts via the share buttons.

It's worth noting that Veo 2 in Gemini currently generates videos limited to 8 seconds in length at 720p resolution. While this is sufficient for some short-form video platforms or a quick proof of concept, it lags behind the industry trend toward longer durations, higher resolutions and greater narrative power, as demonstrated by models such as OpenAI's Sora. The current use of Veo 2 in Gemini appears to be focused more on a lightweight, on-the-fly authoring experience. In addition, the feature has a monthly generation limit, which may affect the creative process for heavy users.

This video generation feature is being rolled out globally to Gemini Advanced web and mobile users in all languages supported by Gemini.

 

Whisk Animate: Making Still Images Move

In addition to text-to-video generation, Google is also bringing Veo 2 to the Whisk platform through a new feature, Whisk Animate. Whisk is an experimental project launched by Google Labs last December that lets users explore and visualize ideas by combining text and image prompts.

Now, with Whisk Animate, Google One AI Premium subscribers can turn still images they've created or uploaded into 8-second motion videos. This is a convenient tool for those who want to add motion to their existing images. The feature is currently available in over 60 countries.

  • Whisk Animate introductory video: https://www.youtube.com/watch?v=2yYDI-p5aGs (the original link is a thumbnail; a presumed YouTube viewing link is provided here)

Integrating video generation into Gemini and Whisk reflects Google's strategy of building AI authoring tools into its existing ecosystem and subscription services. This lowers the barrier to using advanced AI capabilities, but it also ties users to specific paid subscriptions.

 

Safety Considerations and Industry Responsibility

Along with the launch of the video generation feature, Google also described the safety measures it has taken. These include extensive red teaming and evaluation to prevent the generation of content that violates its policies.

A key measure is that all videos generated by Veo 2 are embedded with a SynthID digital watermark. The watermark is designed to be embedded in every frame of the video to identify it as AI-generated. As AI-generated content becomes more common and harder to tell apart from authentic footage, reliable watermarking is critical to improving transparency and combating disinformation, and is an integral part of responsible AI development.

Google also acknowledges that, as with all generative AI tools, Gemini's output is largely determined by user prompts and may include objectionable content in some cases. It encourages users to submit feedback via the feedback button to support continuous improvement.

May not be reproduced without permission: Chief AI Sharing Circle » Google's Veo 2 Video Generation Comes to Gemini and Whisk, Expanding AI Authoring Tool Territory