LeifCarrotson 3 hours ago
I've also been running Qwen 3.6 35B A3B on my Windows laptop (64 GB RAM, 4 GB GPU) and it's at least tolerable. It's not fast - a few tokens per second, slower than reading speed - but I can give it a task and come back later. That was a $600 laptop off eBay a few years ago, not a $6,000 machine. Do these unified-memory Macs and giant 24 GB desktop GPUs achieve dozens or hundreds of tokens per second, commensurate with their 10x-20x cost?
jaggederest 4 minutes ago
35B A3B runs at ~100 tokens per second on the best M5 Max GPU setup.
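
As a rough sanity check on the cost question, here is a small sketch of the arithmetic. It takes the $600 and $6,000 price points and the ~100 tokens per second figure from the comments as given, and assumes "a few tokens per second" means about 3, which is a guess, not a measurement. The function name `speedup_per_cost` is made up for illustration.

```python
def speedup_per_cost(slow_tps, fast_tps, slow_price, fast_price):
    """Return (speedup, cost_ratio, speedup divided by cost_ratio)."""
    speedup = fast_tps / slow_tps
    cost_ratio = fast_price / slow_price
    return speedup, cost_ratio, speedup / cost_ratio


# Assumption: ~3 tok/s stands in for "a few tokens per second" on the laptop.
# The 100 tok/s, $600, and $6,000 figures are the ones quoted in the thread.
speedup, cost_ratio, value = speedup_per_cost(3.0, 100.0, 600.0, 6000.0)
print(f"{speedup:.1f}x faster for {cost_ratio:.0f}x the price")
print(f"speedup per unit of extra cost: {value:.2f}")
```

Under those assumptions the speedup works out to roughly 33x for 10x the price, so the throughput gain would outpace the price gap. The conclusion shifts if the laptop's real rate or the actual machine price differs much from the assumed numbers.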