goldenarm 3 hours ago

gemma4-e4b scores 50% better than gemma4-26b in your benchmark, so something's wrong.

guilamu 3 hours ago | parent [-]

Yes, those two models were tested on my own PC (local inference using my own CPU/GPU), so something may be bugged in my setup. gemma4-26b should be far better than gemma4-e4b.

embedding-shape 2 hours ago | parent [-]

Sounds like the bigger model might be running with a worse quantization? Quantization matters a lot for quality; basically anything below Q8 is borderline unusable. If the quantization level isn't already specified in a benchmark, it probably should be.
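
As a rough illustration of why bit width matters, here is a minimal sketch that round-trips some random numbers through a naive symmetric quantizer at 8 and 4 bits and compares the rounding error. The weights are made up (random Gaussian values standing in for a weight tensor), and the single per-tensor scale is an assumption for simplicity: real formats such as the Q8/Q4 variants used for local inference use block-wise scales and other tricks, so this shows only the general trend, not what any particular benchmark would score.

```python
import random

def quantize_roundtrip(weights, bits):
    """Naive symmetric round-to-nearest quantization, then dequantization.

    Uses one scale for the whole list (an assumption for simplicity).
    """
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits
    scale = max(abs(w) for w in weights) / qmax
    # Round to the nearest integer level, clamp to the range, map back to floats.
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]

def mean_abs_error(a, b):
    """Average absolute difference between two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

random.seed(0)
# Hypothetical stand-in for a weight tensor, not taken from any real model.
weights = [random.gauss(0.0, 1.0) for _ in range(4096)]

err8 = mean_abs_error(weights, quantize_roundtrip(weights, 8))
err4 = mean_abs_error(weights, quantize_roundtrip(weights, 4))

print(f"8-bit mean abs error: {err8:.5f}")
print(f"4-bit mean abs error: {err4:.5f}")
```

The 4-bit error comes out much larger than the 8-bit error because there are far fewer levels to round to. How much that hurts output quality depends on the model and the quantization scheme, which is why the level used is worth reporting next to the score.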