Alibaba launches image generation AI 'Qwen VLo,' which uses a progressive generation method to draw images from the top down like TV scan lines



Qwen, Alibaba's AI development team, announced the image generation AI 'Qwen VLo' on Thursday, June 26, 2025. Qwen VLo has a strong understanding of image content and can edit images accurately. A major feature is its progressive generation method, in which an image is generated in order starting from the top left.

Qwen VLo: From 'Understanding' the World to 'Depicting' It | Qwen

https://qwenlm.github.io/blog/qwen-vlo/

Below is a demo video showing the image generation process of Qwen VLo. Many existing image generation AI models work by 'roughly depicting the entire image and gradually increasing its resolution,' but Qwen VLo uses a progressive generation method, in which the image is built up gradually from left to right and top to bottom. Qwen VLo continuously refines its predictions during generation to keep the final result consistent. The development team promotes the progressive generation method by saying, 'Not only does it improve visual quality, it also provides users with a flexible and controllable creative experience.'

Alibaba's image generation AI 'Qwen VLo' generates images - YouTube
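The left-to-right, top-to-bottom order described above can be illustrated with a small toy sketch. This is not Qwen VLo's implementation, which the announcement does not detail. The grid of integer "patches," the function names, and the `toy_predict` rule are all hypothetical stand-ins chosen only to show that each position is filled in scan-line order using what has already been generated.

```python
from typing import Callable, Iterator, List, Optional, Tuple

Grid = List[List[Optional[int]]]


def raster_order(rows: int, cols: int) -> Iterator[Tuple[int, int]]:
    """Yield (row, col) positions left to right, then top to bottom."""
    for r in range(rows):
        for c in range(cols):
            yield r, c


def toy_predict(grid: Grid, r: int, c: int) -> int:
    """Hypothetical stand-in for a model (assumption, not Qwen VLo's rule).

    It looks only at the left and upper neighbours, which are guaranteed
    to be filled already when positions are visited in raster order.
    """
    left = grid[r][c - 1] if c > 0 else 0
    up = grid[r - 1][c] if r > 0 else 0
    return (left + up + 1) % 10


def generate(
    rows: int,
    cols: int,
    predict: Callable[[Grid, int, int], int] = toy_predict,
) -> Tuple[Grid, List[Tuple[int, int]]]:
    """Fill a rows x cols grid one patch at a time in raster order."""
    grid: Grid = [[None] * cols for _ in range(rows)]
    order: List[Tuple[int, int]] = []
    for r, c in raster_order(rows, cols):
        grid[r][c] = predict(grid, r, c)
        order.append((r, c))
    return grid, order


if __name__ == "__main__":
    grid, order = generate(3, 4)
    print("fill order:", order)
    for row in grid:
        print(" ".join(str(v) for v in row))
```

Running the script prints the fill order followed by the finished grid. The coarse-to-fine approach the article contrasts this with would instead update every position at each step.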


Qwen VLo is available in the chat AI 'Qwen Chat'. It supports Chinese and English, and in addition to generating images from text, you can also upload images and edit them.



Compared to previous models, Qwen VLo has an enhanced ability to recognize content within images, and can understand natural language instructions such as 'change the color of the car to red' and accurately reflect them in the editing results.

Below is an example of 'image editing using Qwen VLo' shown by the development team. First, the original image looks like this.



When the user gave the instruction 'change to real photo,' the image changed to a photorealistic one while keeping the original scene of 'a bear wearing a white T-shirt sitting and eating watermelon.'



If you say 'change background to the Eiffel Tower,' the background will change as instructed.



When you enter '变成气球飘到空 (Change to a balloon floating in the air)', it looks like this. It's a simple instruction, but it successfully turns only the bear into a balloon.



At the time of writing, Qwen VLo is in the preview stage and may produce problems such as 'inconsistency with the prompt' and 'inconsistency with the original image.' The development team has indicated that it will continue working to improve the model.

in Software, Posted by log1o_hf