Google DeepMind Launches Veo 2 to Challenge OpenAI’s Sora – Ankor Tech

Google DeepMind has unveiled Veo 2, its latest video-generation model and a direct competitor to OpenAI’s Sora. Announced this Monday, the model is the successor to the original Veo and can generate high-fidelity video clips more than two minutes long at up to 4K resolution (4096 x 2160 pixels).

Veo 2 in VideoFX. Image Credits: Google

Technical Superiority vs. Experimental Reality

On paper, Veo 2 dwarfs its primary rival. Its maximum resolution has roughly four times the pixel count of Sora’s 1080p ceiling, and its maximum clip duration is over six times Sora’s 20-second limit. However, these figures are theoretical maximums. Within Google’s experimental VideoFX platform, where the model is currently being tested, outputs remain restricted to 720p and eight seconds, which is below what Sora can produce today.
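
The comparison above comes down to simple arithmetic, sketched below in Python. The Veo 2 figures are the ones stated in this article; Sora’s 1920 x 1080, 20-second limits and the 1280 x 720 frame size for 720p are assumptions based on commonly cited specifications, and the dictionary and function names are purely illustrative.

```python
# Veo 2 maximums as stated in the article (two minutes treated as 120 s).
VEO2_MAX = {"width": 4096, "height": 2160, "seconds": 120}
# Assumed: Sora's publicly cited limits of 1080p and 20 seconds.
SORA_MAX = {"width": 1920, "height": 1080, "seconds": 20}
# Assumed: 720p taken as the standard 16:9 frame of 1280 x 720.
VIDEOFX_CAP = {"width": 1280, "height": 720, "seconds": 8}


def pixel_ratio(a, b):
    """Return how many times more pixels per frame `a` has than `b`."""
    return (a["width"] * a["height"]) / (b["width"] * b["height"])


def duration_ratio(a, b):
    """Return how many times longer the clip limit of `a` is than `b`."""
    return a["seconds"] / b["seconds"]


if __name__ == "__main__":
    print(f"Veo 2 max vs Sora, pixels:   {pixel_ratio(VEO2_MAX, SORA_MAX):.2f}x")
    print(f"Veo 2 max vs Sora, duration: {duration_ratio(VEO2_MAX, SORA_MAX):.1f}x")
    print(f"VideoFX cap vs Sora, pixels:   {pixel_ratio(VIDEOFX_CAP, SORA_MAX):.2f}x")
    print(f"VideoFX cap vs Sora, duration: {duration_ratio(VIDEOFX_CAP, SORA_MAX):.1f}x")
```

Under these assumptions, the theoretical maximum works out to a little over four times Sora’s pixel count and six times its duration, while the current VideoFX cap comes to less than half of Sora’s on both measures.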

Eli Collins, VP of product at DeepMind, confirmed that while access is currently limited via a waitlist, the company plans to expand the number of users with access this week. Veo 2 is also slated for integration into the Vertex AI developer platform once the model is ready for use at scale.

Enhanced Physics and Cinematic Precision

Veo 2 introduces significant improvements in motion modeling, fluid dynamics, and the behavior of light. The model is better at rendering difficult phenomena such as liquids and reflections, and it gives users more granular control over virtual camera movements and angles.

Google Veo 2 sample. Image Credits: Google

Despite these advancements, DeepMind acknowledges that the model still struggles with “coherence and consistency” over longer timeframes. Character stability, intricate motion, and occasional artifacts, which can give clips an “uncanny valley” quality, remain active areas of development. The team is currently collaborating with creative professionals, including Donald Glover and The Weeknd, to refine the tool’s practical utility.

Training Data and Safety Protocols

DeepMind says that Veo 2 was trained on high-quality video-description pairings. While the company has not disclosed the full scope of its training sources, it has previously noted that YouTube content may be used. The practice feeds an ongoing industry debate over copyright; DeepMind asserts that training on public data constitutes fair use.

To address risks like deepfakes and regurgitation (a model reproducing its training data nearly verbatim), Google is implementing prompt-level safety filters and its proprietary SynthID watermarking technology. However, the company admits that watermarking is not foolproof against sophisticated manipulation.

Imagen 3 Updates

Alongside the Veo 2 launch, Google has upgraded Imagen 3 for the ImageFX tool. This update focuses on improved prompt adherence and higher-quality texture rendering. The interface has also been updated with “chiplets,” interactive drop-down menus that let users quickly iterate on specific elements within their text prompts.

Google ImageFX
Image Credits: Google