May 21, 2025 11:25:00

Google announces video generation AI 'Veo 3', capable of 4K output and simultaneous audio generation

Google has announced the release of its video generation model, Veo 3. It not only improves on the quality of its predecessor, Veo 2, which can create videos at resolutions of up to 4K, but is also Google's first video generation AI to generate videos with audio.

Veo - Google DeepMind
https://deepmind.google/models/veo/

Imagen 4, Veo 3: Google's latest media generation models
https://blog.google/intl/ja-jp/company-news/technology/aigenerative-media-models-io-2025/

Below is a video generated with Veo 3. The sound of waves and dialogue were also generated at the same time as the video.

Say goodbye to the silent era of video generation: Introducing Veo 3 — with native audio generation. 🗣️

Quality is up from Veo 2, and now you can add dialogue between characters, sound effects and background noise.

Veo 3 is available now in the @GeminiApp for Google AI Ultra… pic.twitter.com/7rcXeBslyU
— Google (@Google) May 20, 2025

It can also generate human voices, city traffic noise, birdsong in a park, and conversations between animated characters. Until now, most video generation models could only produce silent videos, so Google announced Veo 3 with the line, 'Say goodbye to the silent era of video generation.'

Animate your story in your style with Veo 3. 🖌️

Here are some of our favorite videos. Sound on. 🔈 https://t.co/5wUMEaqNdD 🧵 pic.twitter.com/vl1R4nZJT4
— Google DeepMind (@GoogleDeepMind) May 20, 2025

You can generate videos containing a variety of sounds, such as the sound of paper rustling or dynamic sound effects.

Video, meet audio. 🎥🤝🔊

With Veo 3, our new state-of-the-art generative video model, you can add soundtracks to clips you make.

Create talking characters, include sound effects, and more while developing videos in a range of cinematic styles. 🧵 pic.twitter.com/5Hfpetfg8b
— Google DeepMind (@GoogleDeepMind) May 20, 2025

Veo 3 accepts input prompts in the form of text or images, and is said to be able to mirror real-world physics and achieve accurate lip-syncing. It also has excellent comprehension, and can generate vivid visuals from nothing more than a short story given as the prompt.

Veo 3 will be available to subscribers of Google's highest-end AI plan, Google AI Ultra, which launched on May 21, 2025. Google AI Ultra is available only in the United States at launch.

In addition, based on what it learned from working with creators and filmmakers while developing Veo 3, the company has added new features to the Veo 2 model as well. These include adjusting generation with reference images, setting camera movements such as rotation and zoom, expanding the frame to convert videos from portrait to landscape, and adding or removing objects in videos.

Even if you provide separate images of scenes, characters, and objects, the model can combine them to generate a single video.

By providing an image that defines your style (top left below), you can generate a video with a similar look.

If you provide an image of your character (top left below), you can have the character appear in your video while maintaining its appearance.

May 21, 2025 11:25:00 in Software, Video, Posted by log1p_kr