Have you tried the dense Qwen3.5 models (27B, 9B)? Or any diffusion models (Flux Klein, Zimage)? I'm trying to gauge how much of a perf boost I'd get by upgrading from an M3 Pro.

For reference, here's what I get on the M3 Pro:

  | model                          |       size |     params | backend    | threads |            test |                  t/s |
  | ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
  | qwen35 ?B Q5_K - Medium        |   6.12 GiB |     8.95 B | MTL,BLAS   |       6 |           pp512 |        288.90 ± 0.67 |
  | qwen35 ?B Q5_K - Medium        |   6.12 GiB |     8.95 B | MTL,BLAS   |       6 |           tg128 |         16.58 ± 0.05 |

  | model                          |       size |     params | backend    | threads |            test |                  t/s |
  | ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
  | gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | MTL,BLAS   |       6 |           pp512 |        615.94 ± 2.23 |
  | gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | MTL,BLAS   |       6 |           tg128 |         42.85 ± 0.61 |

  Klein 4B completes a 1024px generation in 72 seconds.
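
If you'd rather not eyeball it, here's a rough Python sketch for comparing runs. It assumes the rows are pasted straight from llama-bench's markdown output, like the tables above. `parse_rows` and `speedup` are just names I made up for this, not part of any tool.

```python
import re

# Rows copied from the llama-bench tables above (M3 Pro).
M3_PRO_ROWS = """
| qwen35 ?B Q5_K - Medium        |   6.12 GiB |     8.95 B | MTL,BLAS   |       6 |           pp512 |        288.90 ± 0.67 |
| qwen35 ?B Q5_K - Medium        |   6.12 GiB |     8.95 B | MTL,BLAS   |       6 |           tg128 |         16.58 ± 0.05 |
| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | MTL,BLAS   |       6 |           pp512 |        615.94 ± 2.23 |
| gpt-oss 20B MXFP4 MoE          |  11.27 GiB |    20.91 B | MTL,BLAS   |       6 |           tg128 |         42.85 ± 0.61 |
"""


def parse_rows(text):
    """Return {(model, test): tokens_per_second} from llama-bench markdown rows."""
    results = {}
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 7:
            continue  # blank line or unexpected layout
        match = re.match(r"([\d.]+)\s*±", cells[6])
        if not match:
            continue  # header or separator row
        results[(cells[0], cells[5])] = float(match.group(1))
    return results


def speedup(baseline, other):
    """Ratio other/baseline for every (model, test) pair present in both runs."""
    return {key: other[key] / baseline[key] for key in baseline if key in other}


baseline = parse_rows(M3_PRO_ROWS)
```

Paste the rows from your machine into a second string, run it through `parse_rows`, and `speedup(baseline, yours)` gives the per-test ratio. It only matches rows where the model name and test label are identical, so the same quant has to be used on both sides.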