rdslw 4 days ago
## performance data for token generation using lmstudio

- gemma4-31b normal q8 -> 5.1 tok/s
- gemma4-31b normal q16 -> 3.7 t/s
- gemma4-31b distil q16 -> 3.6 t/s
- gemma4-31b distil q8 -> 5.7 tok/s (!)
- gemma4-26b-a4b ud q8kxl -> 38 t/s (!)
- gemma4-26b-a4b ud q16 -> 12 t/s
- gemma4-26b-a4b cl q8 -> 42 t/s (!)
- gemma4-26b-a4b cl q16 -> 12 t/s
- qwen3.5-35b-a3b-UD@q6_k -> 52 t/s (!)
- qwen3.5-35b-a3b-uncensored-hauhaucs-aggressive@q8_0 -> 34 tok/s (!)
- qwen3.5-35b-a3b-uncensored-hauhaucs-aggressive@bf16 -> 11 tok/s
- qwen3.5-27b-claude-4.6-opus-reasoning-distilled-v2 q8 -> 8 tok/s
- qwen3.5 122B A10B MXFP4 Mo qwen3.5-122b-a10b (q4) -> 11 tok/s
- qwen3.5-122b-a10b-uncensored-hauhaucs-aggressive (q6) -> 10 tok/s
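
A quick way to compare the 8-bit and 16-bit runs is to divide one throughput by the other. The sketch below does that for the models that appear with both variants. The numbers are copied from the list. The pairing itself is an assumption: it treats "q16" and "bf16" as the 16-bit variant of the same model, and "q8kxl" / "q8_0" as its 8-bit variant. The names `pairs` and `speedup` are only illustrative.

```python
# Throughput in tokens/s, copied from the list: (8-bit run, 16-bit run).
# Assumption: "q16"/"bf16" and "q8"/"q8kxl"/"q8_0" are variants of the same model.
pairs = {
    "gemma4-31b normal": (5.1, 3.7),
    "gemma4-31b distil": (5.7, 3.6),
    "gemma4-26b-a4b ud": (38.0, 12.0),
    "gemma4-26b-a4b cl": (42.0, 12.0),
    "qwen3.5-35b-a3b-uncensored-hauhaucs-aggressive": (34.0, 11.0),
}

# Ratio of 8-bit to 16-bit throughput for each model.
speedup = {name: q8 / full for name, (q8, full) in pairs.items()}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.2f}x faster at 8-bit")
```

These are single measurements on one machine, so the ratios are only a rough guide.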