verdverm 10 hours ago

Look into updating to Gemma4 and Qwen3.6; they are good at agentic tasks. qwen36moe with unsloth's 8-bit quant is my daily driver now.

nateb2022 9 hours ago | parent [-]

Have you noticed a gap between the 8-bit and 4-bit quants? I've always run the 4-bit quant because it needs less memory.

verdverm 6 hours ago | parent [-]

I run the biggest quant because it is more capable. spark has enough memory for two qwen instances at 8-bit and full context length (roughly 48G each).
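For anyone wanting to sanity-check the memory trade-off, here is a rough back-of-the-envelope sketch. The parameter count and the memory budget below are assumptions for illustration only, not figures from this thread; only the "roughly 48G each" footprint comes from the comment above. It counts weights alone and ignores KV cache and runtime overhead, which is why a real footprint at full context ends up larger than the weights figure.

```python
def weight_gb(params: float, bits: int) -> float:
    """Approximate size of the weights alone, in GB (1e9 bytes)."""
    return params * bits / 8 / 1e9


def copies_that_fit(budget_gb: float, per_model_gb: float) -> int:
    """How many model instances of a given footprint fit in a memory budget."""
    return int(budget_gb // per_model_gb)


# Hypothetical parameter count, chosen only to show the 8-bit vs 4-bit ratio.
HYPOTHETICAL_PARAMS = 30e9
# Assumed total memory budget in GB; substitute your own machine's figure.
ASSUMED_BUDGET_GB = 128
# Per-instance footprint quoted in the thread (weights plus full context).
QUOTED_FOOTPRINT_GB = 48

print("8-bit weights (GB):", weight_gb(HYPOTHETICAL_PARAMS, 8))
print("4-bit weights (GB):", weight_gb(HYPOTHETICAL_PARAMS, 4))
print("instances that fit:", copies_that_fit(ASSUMED_BUDGET_GB, QUOTED_FOOTPRINT_GB))
```

The takeaway is just that halving the bit width halves the weight size, so the question is whether the spare memory is worth more to you than the extra capability of the larger quant.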

I find gemini/gemma have become worse at coding. They are better for non-coding tasks, but maybe not even that: hallucinations and instruction following have both degraded, ime.