Google announces video generation AI 'Veo 3', capable of 4K output and simultaneous audio generation



Google has announced the release of its video generation model, Veo 3. It improves on the quality of its predecessor, Veo 2, which can create videos at resolutions of up to 4K, and it is also the first Google video generation AI to generate videos with audio.

Veo - Google DeepMind
https://deepmind.google/models/veo/

Imagen 4, Veo 3: Google's latest media generation models
https://blog.google/intl/ja-jp/company-news/technology/aigenerative-media-models-io-2025/

Below is a video generated with Veo 3. The sound of waves and dialogue were also generated at the same time as the video.




It can also generate human voices, city traffic noise, birdsong in a park, and conversations between animated characters. Until now, most video generation models could only produce silent videos, so Google introduced Veo 3 with the line, 'Say goodbye to the silent era of video generation.'




You can generate videos containing a variety of sounds, such as the sound of paper rustling or dynamic sound effects.




Veo 3 accepts prompts as text or images, and is said to be able to mirror real-world physics and achieve accurate lip-syncing. It also has excellent prompt comprehension, and can generate vivid video from nothing more than a short story given as a prompt.

Veo 3 will be available to subscribers of Google AI Ultra, Google's highest-end AI plan, which launched on May 21, 2025. Google AI Ultra is available only in the United States at launch.

In addition, drawing on what it learned from working with creators and filmmakers while developing Veo 3, Google has added new features to the Veo 2 model as well. These include adjusting generation based on supplied images, specifying camera movements such as rotation and zoom, expanding the frame to convert videos from portrait to landscape, and adding or removing objects in videos.

Even if you provide separate images of scenes, characters, and objects, the model can combine them to generate a single video.



By providing an image that defines your style (top left below), you can generate a video with a similar look.



If you provide an image of your character (top left below), you can have the character appear in your video while maintaining its appearance.



in Software, Video, Posted by log1p_kr