Have you tried the dense (27B, 9B) Qwen3.5 models? Or any diffusion models (Flux Klein, Zimage)? I'm trying to gauge how much of a perf boost I'd get by upgrading from an M3 Pro.
For reference, here are my llama-bench numbers on the M3 Pro:
| model | size | params | backend | threads | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
| qwen35 ?B Q5_K - Medium | 6.12 GiB | 8.95 B | MTL,BLAS | 6 | pp512 | 288.90 ± 0.67 |
| qwen35 ?B Q5_K - Medium | 6.12 GiB | 8.95 B | MTL,BLAS | 6 | tg128 | 16.58 ± 0.05 |
| gpt-oss 20B MXFP4 MoE | 11.27 GiB | 20.91 B | MTL,BLAS | 6 | pp512 | 615.94 ± 2.23 |
| gpt-oss 20B MXFP4 MoE | 11.27 GiB | 20.91 B | MTL,BLAS | 6 | tg128 | 42.85 ± 0.61 |
Klein 4B completes a 1024px generation in 72 seconds.
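
If you can share the same pp512/tg128 runs from your machine, the comparison is just a ratio against the t/s figures above. Here's a rough sketch of how I'd line them up, assuming those figures are the M3 Pro baseline. The `speedup` and `compare` helpers are names I made up for illustration, and the candidate numbers are left for you to fill in.

```python
# Baseline mean t/s copied from the tables above (error bars dropped).
# Assumption: these are the M3 Pro results.
M3_PRO_TPS = {
    ("qwen35 ?B Q5_K - Medium", "pp512"): 288.90,
    ("qwen35 ?B Q5_K - Medium", "tg128"): 16.58,
    ("gpt-oss 20B MXFP4 MoE", "pp512"): 615.94,
    ("gpt-oss 20B MXFP4 MoE", "tg128"): 42.85,
}


def speedup(baseline_tps: float, candidate_tps: float) -> float:
    """Return how many times faster the candidate is (higher t/s is better)."""
    if baseline_tps <= 0:
        raise ValueError("baseline t/s must be positive")
    return candidate_tps / baseline_tps


def compare(candidate: dict) -> dict:
    """Speedup per (model, test) key that appears in both result sets."""
    return {
        key: speedup(M3_PRO_TPS[key], tps)
        for key, tps in candidate.items()
        if key in M3_PRO_TPS
    }
```

Pass `compare` a dict keyed the same way with your own t/s values and it returns the per-test multiplier. The Klein timing would need the inverse ratio, since lower seconds is better there.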