It was just a taste, two 8-second VEO 3 videos, but as with so many life-changing things, I will never forget the first time I generated synchronized sound and video from a carefully crafted prompt.
I currently subscribe to Google AI Pro, which costs $19.99 a month and gives you access to the Gemini 2.5 Pro model and, more importantly, a limited trial of VEO 3 video generation.
VEO 3 is the tipping point of generative video creation: for the first time, it is possible to create videos with dialogue, background sounds and sound effects, all synchronized to the action.
While I understood that my VEO 3 access would be limited, I wasn’t sure how many videos I could generate with the new model. The answer, it seems, is exactly two. If I want unlimited access, I can switch to Google AI Ultra for an eye-watering $249.99 a month (there is a three-month deal for $124.99 a month). And VEO 3 is currently available only in the US.
Since VEO 3 launched at Google I/O 2025, my TikTok feed has been filled with these incredible and often quite realistic AI clips. Some look like infomercials or commercials; others are simply impossible, like a woman interviewing a smiling man who is obviously on fire.
I was torn between creating realism, hyperrealism and something fantastical. In the end, working in the Gemini 2.5 Pro prompt window that supports video creation, I built a prompt that was a mix of sci-fi, drama and whimsy.
However, writing inside the prompt window turned out to be a mistake, because I accidentally hit Return before I had fully fleshed out my idea, and suddenly VEO 3 was busy generating my video.
This was my first prompt:
“Bill and Jessica live in a log cabin built on the surface of Mars. Bill comes out of the cabin to find Jessica fighting a Martian using only a stuffed animal.
Bill screams at Jessica: What are you doing?
Jessica: This damn Martian wants our country and he can’t have it.”
As you can see, there is not much detail, and as easy as it is to generate a video in VEO 3 (and the audio-free VEO 2), you get a better result by including more details and dialogue. VEO 3 will not have the characters say anything you don’t script. In this case, because I hit Return too soon, Jessica’s dialogue is cut off and I did not get to polish my prompt.
Still, VEO 3 took those sparse details and created a striking piece of video in about 5 minutes. Watch it (sound up for the full effect).
It’s far from perfect. In fact, Bill never visibly speaks his line, even though we hear it from off camera. Jessica’s screams (or are they the Martian’s?) also come from somewhere off camera.
There is an unfortunate sound effect that may come from Bill and which I did not script. I also don’t know why Jessica speaks her lines directly to the camera.
Again, I assume that if I had specified whom she should talk to, VEO 3 might have made a different choice.
Still, there are many subtler things that are impressive. VEO 3 gets the setting right; note the reddish haze of the Martian daylight. The Martian is scary. However, I am more impressed with the sound effects, such as the sound of the cabin door, the footfalls on the Martian soil and the sound of the stuffed animal striking the Martian’s chest.
Take 2
For my second prompt, I wrote and edited it outside Gemini. I did my best to set the scene, describe the characters and delineate the dialogue and any sound effects. Here’s the prompt:
The scene is a lush forest with sunlight streaming in from overhead. We hear screams from pterodactyls in the background and the sound of leaves rustling in a light breeze.
A tyrannosaurus carefully paints a large canvas that depicts a colorful image of a man being destroyed by an asteroid.
Tyrannosaurus sings quietly to himself, “Pink Pony Club, I’m gonna keep on dancing …”
A velociraptor wanders over and asks, “Why are you painting it?”
Tyrannosaurus: “AI made me do it.”
Velociraptor backs away in horror and says, “What?!!”
As you can see, I was partly inspired by some of the self-referential VEO 3 videos I had seen on TikTok, where the characters break the fourth wall and mention that they are AIs in a video. While my detail work mostly paid off, VEO 3 made a number of questionable choices.
I don’t know why it chose to dress the T-Rex but neglected to give him a brush, or why the character in the painting looks like some kind of 1970s detective. And although Gemini clearly knows a thing or two about what dinosaurs look like, it got the relative sizes of the T-Rex and the velociraptor all wrong. I was also disappointed that instead of “screams from pterodactyls”, I got a static picture of pterodactyls and the sound of birdsong in the background.
The synchronization of the dialogue is mostly good, although I had hoped for more emoting from the velociraptor.
Overall, it took me a few minutes to write each of these prompts and another 3 to 5 minutes for VEO 3 to generate each video. I think that if I spent more time painting a detailed picture, even writing an entire short story, I could get an even better result.
I wish I could tell you that with certainty, but I have just run my short trial dry. If you plan to try making a few VEO 3 videos, here are my core tips:
- Write your prompt outside Gemini
- Choose your subjects carefully
- Spell out every detail, from the characters to the scene
- Detail each action, or VEO 3 will make something up or leave a character doing nothing
- Script the dialogue so that it is clear
- Describe the feeling behind the delivery of dialogue
- Include details of background sounds
- Include sound effect descriptions if you want specific sounds
- Each video is a maximum of 8 seconds, so plan accordingly
- Try creating multiple videos that continue a story, but keep the descriptions consistent
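If you draft prompts outside Gemini anyway, one way to keep these tips in front of you is to build the prompt from named parts before pasting it in. The following is only a rough Python sketch: the class names, fields and output layout are my own invention for illustration, not a format VEO 3 requires or documents, and the "curious" delivery note and the second action line are made-up placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Beat:
    """One moment in the clip: an action plus optional dialogue and sound."""
    action: str
    speaker: str = ""
    line: str = ""
    delivery: str = ""       # the feeling behind the line, e.g. "horrified"
    sound_effect: str = ""


@dataclass
class ScenePrompt:
    """Hypothetical container for the parts of a video prompt."""
    setting: str
    background_sounds: list[str] = field(default_factory=list)
    beats: list[Beat] = field(default_factory=list)

    def render(self) -> str:
        """Assemble plain text that can be pasted into the prompt window."""
        parts = [f"The scene is {self.setting}."]
        if self.background_sounds:
            parts.append("We hear " + " and ".join(self.background_sounds) + ".")
        for beat in self.beats:
            text = beat.action.rstrip(".") + "."
            if beat.sound_effect:
                text += f" Sound effect: {beat.sound_effect}."
            if beat.line:
                how = f" ({beat.delivery})" if beat.delivery else ""
                text += f'\n{beat.speaker or "Speaker"}{how}: "{beat.line}"'
            parts.append(text)
        return "\n".join(parts)

    def missing_details(self) -> list[str]:
        """List gaps that the tips above suggest filling before generating."""
        notes = []
        if not self.background_sounds:
            notes.append("No background sounds described.")
        for i, beat in enumerate(self.beats, start=1):
            if beat.line and not beat.speaker:
                notes.append(f"Beat {i}: dialogue has no speaker.")
            if beat.line and not beat.delivery:
                notes.append(f"Beat {i}: no delivery described for the dialogue.")
        return notes


prompt = ScenePrompt(
    setting="a lush forest with sunlight coming in from overhead",
    background_sounds=["screams from pterodactyls", "leaves moving in a light breeze"],
    beats=[
        Beat(
            action="A velociraptor wanders over to a tyrannosaurus painting a large canvas",
            speaker="Velociraptor",
            line="Why are you painting it?",
            delivery="curious",  # illustrative placeholder
        ),
        Beat(
            action="The tyrannosaurus keeps painting",  # illustrative placeholder
            speaker="Tyrannosaurus",
            line="AI made me do it.",
        ),
    ],
)

print(prompt.render())
for note in prompt.missing_details():
    print("CHECK:", note)
```

The check at the end does nothing clever; it simply flags dialogue with no speaker or no described delivery, and scenes with no background sounds, so you notice the gap before spending one of your limited generations.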
Good luck with your VEO 3 test drives. Let me know how it goes in the comments below.