SV_BubbleTime 2 hours ago

For the more technical…

Qwen 2512 (December edition of Qwen Image)

* 19B parameters: a 40GB file at FP16 that fit on a 3090 at FP8. On anything less than that you were in GGUF format at Q6 to Q4 quantizations… slow, but still good quality.

* used Qwen 2.5 VL, so a large model and a very good vision model.

* And IIRC their own VAE, which had known and obvious high-frequency artifacts. Some people would pass the image through another VAE (like the WAN video model’s) or upscale-then-downscale to remove them.
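The upscale-then-downscale workaround works because resampling acts as a crude low-pass filter, smearing out high-frequency artifacts. A minimal sketch, assuming Pillow; the function name and scale factor are illustrative, not from any actual pipeline:

```python
from PIL import Image

def smooth_vae_artifacts(img: Image.Image, factor: float = 1.5) -> Image.Image:
    """Upscale then downscale with Lanczos resampling to attenuate
    high-frequency artifacts while preserving the original resolution."""
    w, h = img.size
    up = img.resize((int(w * factor), int(h * factor)), Image.LANCZOS)
    return up.resize((w, h), Image.LANCZOS)
```

The round trip keeps the output at the input resolution, so it can be dropped in as a cheap post-processing step after decoding.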

Qwen 2 now is

* a 7B-param model, sitting right between Klein 9B (non-commercial) and Klein 4B (Apache), alongside Z-Image 7B (Apache); this one’s license is unknown. Direct competition, and it will fit on many more GPUs even at FP16.

* upgrades to Qwen 3 VL; I assume this is better than the already great 2.5 VL.

* Unknown on the new VAE. Flux2’s new 128-channel VAE is excellent, but it hasn’t been out long enough for even a frontier Chinese model to pick it up.

Overall, you’re right that this continues the trend of bringing models onto lower-end hardware.
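The back-of-envelope weight math behind that trend, as a sketch: `weight_gb` is a hypothetical helper that counts weights only, ignoring activations and runtime overhead, and the Q4 bits-per-weight figure is an approximation.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate on-disk/VRAM size of the weights alone, in GB."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# Old 19B model vs. the new 7B model
print(f"19B @ FP16: {weight_gb(19, 16):.1f} GB")  # ~38 GB: the ~40GB file
print(f"19B @ FP8:  {weight_gb(19, 8):.1f} GB")   # ~19 GB: squeezes into a 24GB 3090
print(f" 7B @ FP16: {weight_gb(7, 16):.1f} GB")   # ~14 GB: fits many more GPUs
```

At FP16 the 7B model’s weights take less VRAM than the old 19B model did even at FP8, which is why no GGUF gymnastics are needed on mid-range cards.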

Qwen was already excellent, and now they’ve rolled Image and Edit together into an “Omni” model.

Z-Image was the model to beat a couple weeks ago… and now it looks like both Klein and Qwen will beat it! It’s been disappointing to see how Z-Image just refuses to adhere to multiple new training concepts. Maybe they tried to pack it too tightly.

Open weights for this will be amazing. THREE direct competitors all vying to be “SDXL2” at the same time.

The Qwen naming convention was confusing! You had Image, 2509, Edit, 2511 (Edit), 2512 (Image), and LoRA compatibility was unspecified. It’s smart to just 2.0 this mess.

liuliu 2 hours ago

Note that Qwen Image 1.0 (2512) wasted ~8B of its weights on timestep embedding. Both the Z-Image and FLUX.2 series corrected that.

vunderba an hour ago

Agreed! A lot of people were also using ZiT as a refiner downstream to help with some of the more problematic visual aspects of the original Qwen-Image.

I'm really looking forward to running the unified model through its paces.