- Google’s video generation model got a major upgrade
- Announced at Google I/O, Veo 3 can combine audio and video in its output
- It’s an Ultra-tier, US-only feature for now
AI video generation tools like Sora and Pika can create alarmingly realistic videos, and with sufficient effort you can tie these clips together to create a short film. However, one thing they cannot do is generate sound at the same time. Google’s new Veo 3 model can, and it could be a game changer.
Veo 3, which was announced on Tuesday at Google I/O 2025, is the third generation of the powerful Gemini video generation model. With the right prompt, it can produce videos that include sound effects, background noise and, yes, dialogue.
Google briefly demonstrated this capability of the video model. The clip was a CGI-grade animation of some animals talking in a forest. The sound and video were in perfect sync.
If the demo carries over to real-world use, this represents a remarkable tipping point in the AI content generation space.
“We’re emerging from the silent era of video generation,” Google DeepMind CEO Demis Hassabis said on a press call.
Lights, camera, sound
He is not wrong. So far, no other AI video generation model can simultaneously deliver synchronized audio, or audio of any kind, to accompany its video output.
It is still not clear whether Veo 3, which, like its predecessor Veo 2, should be able to output 4K video, surpasses the current video leader, OpenAI’s Sora, in the video quality department. Google has previously claimed that Veo 2 is adept at producing realistic and consistent movement.
Regardless, outputting what appear to be fully produced video clips (video and audio) could instantly make Veo a more attractive platform.
It is not only that Veo 3 can handle dialogue. In the film and TV world, background noise and sound effects are often the work of Foley artists. Now imagine if all you need to do is describe to Veo the sounds you want behind and attached to the action, and it outputs it all, including the video and dialogue. This is work that takes animators weeks or months to do.
In a release on the new model, Google suggests that you tell the AI “a short story in your prompt and the model gives you back a clip that brings it to life.”
If Veo 3 can follow prompts and output minutes or, eventually, hours of consistent video and audio, it won’t be long before we see the first animated feature generated completely through Veo.
Veo 3 is live today and available in the US as part of the new Ultra tier ($249.99 a month) in the Gemini app and also as part of the new Flow tool.
Google also announced a few updates to its Veo 2 video generation model, including the ability to generate video based on reference material you provide, camera controls, outpainting to convert from portrait to landscape, and object addition and removal.