minimaxir 8 hours ago

Looking at the file sizes on the open weights version (https://huggingface.co/black-forest-labs/FLUX.2-dev/tree/mai...), the 24B text encoder is 48GB and the generation model itself is 64GB, which roughly tracks with the 32B parameter count mentioned (at 2 bytes per parameter).

Downloading over 100GB of model weights is a tough sell for the local-only hobbyists.
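A quick sanity check of those file sizes, assuming bf16 weights (2 bytes per parameter):

```python
# Back-of-the-envelope: parameter count -> on-disk size at bf16.
bytes_per_param = 2  # bf16 / fp16

text_encoder_params = 24e9   # 24B text encoder
transformer_params = 32e9    # 32B generation model

text_encoder_gb = text_encoder_params * bytes_per_param / 1e9
transformer_gb = transformer_params * bytes_per_param / 1e9

print(f"text encoder: {text_encoder_gb:.0f} GB")  # 48 GB
print(f"transformer:  {transformer_gb:.0f} GB")   # 64 GB
print(f"total:        {text_encoder_gb + transformer_gb:.0f} GB")  # 112 GB
```

So the ~112GB total is exactly what you'd expect for 24B + 32B parameters stored at 16 bits each.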

BadBadJellyBean 7 hours ago | parent | next [-]

Never mind the download size. Who has the VRAM to run it?

pixelpoet 7 hours ago | parent [-]

I do, 2x Strix Halo machines ready to go.

zamadatix 7 hours ago | parent | prev | next [-]

100 GB is less than a game download; it's actually running it that's the tough sell. That said, the linked blog post seems to say the optimized model is both smaller and greatly improves streaming weights from system RAM, so it may actually be reasonably usable on a single 4090/5090-class setup (I'm not at home to test).

_ache_ 7 hours ago | parent | prev | next [-]

Even a 5090 can't handle that. You have to use multiple GPUs.

So the only single-GPU option will be [klein]... maybe? We don't have much information yet.

Sharlin 5 hours ago | parent [-]

As far as I know, no open-weights image gen tech supports multi-GPU workflows except in the trivial sense that you can generate two images in parallel. The model either fits into the VRAM of a single card or it doesn’t. A 5ish-bit quantization of a 32B-parameter model would be usable by owners of 24GB cards, and very likely someone will create one.
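The arithmetic behind that 5-bit claim (a sketch; real quant formats such as GGUF's Q5 variants store extra scale metadata per block, so actual files run somewhat larger):

```python
# Does a ~5-bit quant of a 32B-parameter model fit in 24GB of VRAM?
params = 32e9
bits_per_param = 5
vram_gb = 24

weight_gb = params * bits_per_param / 8 / 1e9
headroom_gb = vram_gb - weight_gb

print(f"quantized weights: {weight_gb:.0f} GB")  # 20 GB
print(f"headroom:          {headroom_gb:.0f} GB")
```

That leaves roughly 4GB for activations and everything else on the card; the 24B text encoder would still have to run offloaded or quantized separately.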

crest 4 hours ago | parent | prev [-]

The download is a trivial one-time cost, and so is storing it on a direct-attached NVMe SSD. The expensive part is getting a GPU with 64GB of memory.