I have 24 GB of VRAM (an RTX 4090) and run Qwen3.6-35b:iq4. It uses importance-aware quantization, so it isn't nearly as dumb as it sounds, and it fits the 35b into 18 GB, so you have some VRAM left over. So far I've had no issues, other than things like image gen taking a while. I found that if you want image gen done with any alacrity, just have a cloud model do it.

For anything else local, including writing some automation scripts and such, it works great.
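
A rough sanity check on the 18 GB figure, as a back-of-the-envelope sketch. It assumes about 4 bits per weight (the real average depends on the exact quant variant, which isn't stated here), and the function name is made up for illustration. It counts weights only, not KV cache or activations.

```python
def quantized_weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GB (10^9 bytes)."""
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


# Assumption: ~4.0 bits per weight on average; real quant files run a bit higher.
weights_gb = quantized_weight_gb(35, 4.0)
headroom_gb = 24 - weights_gb  # what remains on a 24 GB card for context, etc.
print(f"weights: {weights_gb:.1f} GB, headroom: {headroom_gb:.1f} GB")
```

At a slightly higher average, say 4.25 bits per weight, the same estimate lands a little above 18 GB, so the number quoted is in the right range.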

Zambyte 5 hours ago

Can you link the model? I also have a 24 GB card (7900 XTX). I've been having success with the dense 27b model, but I'd like to see if the 35b iq4 is any better.

	jboss10 5 hours ago
		https://unsloth.ai/docs/models/qwen3.6 and https://huggingface.co/collections/unsloth/qwen36

ai_fry_ur_brain 5 hours ago

What's your example of a "great automation script"?