Remix clone Hacker News


	▲	LeifCarrotson 3 hours ago
		I've also been running Qwen 3.6 35B A3b on my Windows laptop (64 GB RAM, a 4GB GPU) and it's at least tolerable. It's not fast - a few tokens per second, slower than reading speed - but I can give it a task and come back later. That was a $600 laptop off eBay a few years ago, not a $6,000 machine. Are these unified memory Macs and giant 24GB desktop GPUs achieving dozens or hundreds of tokens per second commensurate with their 10x-20x cost?
	▲	jaggederest 4 minutes ago
		35B A3b runs ~100 tokens a second on the best M5 Max GPU setup.