My experience aligns with this. I'm running Gemma 4 31B on a 4090 through llama.cpp with Unsloth models. I also run Qwen 3.6. Qwen is good for thinking and planning because it is faster, but Gemma 4's generated code is much higher quality on the first try (Rust, C++ and C#), so it needs fewer revisions before it reaches a level I'm comfortable merging.

beastman82 6 hours ago

I second the Unsloth models. I'm using them over Blackwell-oriented NVFP4 models because, empirically, they give top quality and performance.

	kroaton 2 hours ago
		NVFP4 will be better if the model provider actually post-trains properly after quantizing.
