mhuffman 3 days ago
This particular one may not work on M chips, but the model itself does. I just tested a different-sized version of the same model in LM Studio on a MacBook Pro (64GB M2 Max, 12 cores), just to see.

Prompt: "Create a solar system simulation in a single self-contained HTML file."

qwen3-next-80b, 4-bit (MLX format, 44.86 GB): 42.56 tok/sec, 2523 tokens, 12.79s to first token. Note: looked like ass, simulation broken, didn't work at all.

Then, as a comparison against a model with a similar on-disk size, I tried GLM. GLM-4-32B-0414-8bit (MLX format, 36.66 GB): 9.31 tok/sec, 2936 tokens, 4.77s to first token. Note: looked fantastic for a first try, everything worked as expected.

Not a fair comparison, 4-bit vs. 8-bit, but it's some data. The tok/sec on a Mac is pretty good, depending on the model you use.
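If you want to reproduce this kind of measurement yourself rather than reading the numbers off the LM Studio UI, here is a minimal sketch, assuming LM Studio's local server is running with its OpenAI-compatible endpoint at the default http://localhost:1234/v1 and the model loaded under the name shown in the app (the model name and timing logic below are illustrative, not what I actually ran):

    import time
    from openai import OpenAI

    # LM Studio's local server speaks the OpenAI API; the api_key is ignored.
    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

    prompt = "Create a solar system simulation in a single self-contained HTML file."

    start = time.perf_counter()
    first_token_at = None
    pieces = []

    stream = client.chat.completions.create(
        model="qwen3-next-80b",  # use whatever identifier LM Studio shows for the loaded model
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        delta = chunk.choices[0].delta.content or ""
        if delta and first_token_at is None:
            first_token_at = time.perf_counter()  # time to first token
        pieces.append(delta)
    elapsed = time.perf_counter() - start

    text = "".join(pieces)
    # Rough token estimate (~4 chars/token); LM Studio reports exact counts in its UI.
    approx_tokens = len(text) / 4
    if first_token_at is not None:
        ttft = first_token_at - start
        print(f"time to first token: {ttft:.2f}s")
        print(f"approx tok/sec: {approx_tokens / (elapsed - ttft):.1f}")

The numbers won't match the UI exactly (the character-based token estimate is crude), but it's enough to compare models head-to-head on the same prompt.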