FLUX.2 [Klein]: Towards Interactive Visual Intelligence

I appreciate that they released a smaller version that is actually open source. It creates a lot more opportunities when you do not need a massive budget just to run the software. The speed improvements look pretty significant as well.

pavelstoev 9 minutes ago

Think of GenAI models as a form of compression. Generally, text compresses extremely well; images and video do not. Yet state-of-the-art text-to-image and text-to-video models are often much smaller (in parameter count) than large language models like Llama-3. Maybe vision models are small because we’re not actually compressing very much of the visual world: the training data covers a narrow, human-biased manifold of common scenes, objects, and styles, and the combinatorial space of visual reality remains largely unexplored. I’m curious what else lies outside that human-biased manifold.

codezero 2 hours ago

I am amazed, though not entirely surprised, that these models keep getting smaller while their quality and effectiveness increase. Z-Image Turbo is wild, and I'm looking forward to trying this one out.

An older thread on this has a lot of comments: https://news.ycombinator.com/item?id=46046916

	roenxi 2 hours ago
		There are probably some more subtle tipping points that small models hit too. One of the challenges of a 100GB model is that there is non-trivial difficulty in downloading and running the thing that a 4GB model doesn't face. At 4GB I think it might be reasonable to assume that most devs can just try it and see what it does.

psubocz 22 minutes ago

> FLUX.2 [klein] 4B The fastest variant in the Klein family. Built for interactive applications, real-time previews, and latency-critical production use cases.

I wonder what kind of use cases could be "latency-critical production use cases"?

SV_BubbleTime an hour ago

Flux2 Klein isn’t some generational leap or anything. It’s good, but let’s be honest, this is an ad.

What will be really interesting to me is the release of Z-Image. If that goes the way it’s looking, it’ll be a natural-language SDXL 2.0, which seems to be what people really want.

Releasing the Turbo/Distilled/Finetune months ago was a genius move really. It hurt Flux and Qwen releases on a possible future implication alone.

If this was intentional, I can’t think of the last time I saw such shrewd marketing.

refulgentis 42 minutes ago

I’m a bit confused: both you and another commenter mention something called Z-Image, presumably another Flux model?

Your framing of it is speculative, i.e. it is forthcoming. Theirs is present tense. Could I trouble you to give us plebes some more context? :)

E.g., parsed as is, and setting aside the general confusion if you’re unfamiliar, it is unclear how one can observe “the way it is looking”, especially if Turbo was released months ago and there is some other model that is unreleased. I chose to bother you because the other’s comment was less focused on lab-on-lab strategy.

ollin 32 minutes ago

Z-Image is another open-weight image-generation model, from Alibaba [1]. Z-Image Turbo was released around the same time as (non-Klein) FLUX.2 and received a generally warmer community response [2], since Z-Image Turbo was faster, also high-quality, and reportedly better at generating NSFW material. The base (non-Turbo) version of Z-Image is not yet released.

[1] https://tongyi-mai.github.io/Z-Image-blog/

[2] https://www.reddit.com/r/StableDiffusion/comments/1p9uu69/no...

	refulgentis 28 minutes ago
		Ahh I see, and Klein is basically a response to Z-Image Turbo, i.e. another 4-8B sized model that fits comfortably on a consumer GPU. It’ll be interesting to see how the NSFW catering plays out for the Chinese labs. I was joking to someone a couple of months ago that Seedream 4’s talent for undressing was an attempt to sow discord, and it was interesting that it flew under the radar. Post-Grok going full gooner pedo, I wonder if Grok will take the heat alone moving forward.