This is a Mac Studio M1 Ultra with 128 GB of RAM.
> llama-bench -m ./gpt-oss-120b-MXFP4-00001-of-00002.gguf -ngl 999 -fa 1 --mmap 0 -p 65536 -b 4096 -ub 4096
| model | size | params | backend | threads | n_batch | n_ubatch | fa | mmap | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | ------: | -------: | -: | ---: | --------------: | -------------------: |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | Metal,BLAS | 16 | 4096 | 4096 | 1 | 0 | pp65536 | 392.37 ± 43.91 |
| gpt-oss 120B MXFP4 MoE | 59.02 GiB | 116.83 B | Metal,BLAS | 16 | 4096 | 4096 | 1 | 0 | tg128 | 65.47 ± 0.08 |
build: a0e13dcb (6470)
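
For a sense of scale, the throughput figures above translate to wall-clock time by simple division. A minimal Python sketch, using the mean t/s values from the table and ignoring the ± spread (the helper name `seconds_for` is hypothetical, not part of llama-bench):

```python
def seconds_for(tokens: int, tokens_per_second: float) -> float:
    """Wall-clock seconds to handle `tokens` at a given throughput."""
    return tokens / tokens_per_second

# Mean values copied from the llama-bench table above.
pp_seconds = seconds_for(65536, 392.37)  # pp65536: prompt processing
tg_seconds = seconds_for(128, 65.47)     # tg128: token generation

print(f"pp65536: {pp_seconds:.0f} s")
print(f"tg128:   {tg_seconds:.2f} s")
```

So a 64K-token prompt takes a bit under three minutes to ingest on this machine, while generating 128 tokens takes about two seconds.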