Remix clone Hacker News

	zargon an hour ago
		Yes, it's definitely the bottleneck for most use cases besides "chatting". That's the reason I have never bought a Mac for LLM purposes. It's frustrating to look for benchmarks, because almost everyone reports decode speed without mentioning prefill speed.
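		To make the arithmetic concrete, here's a rough sketch of why decode speed alone is misleading. The throughput numbers are made up for illustration and are not benchmarks of any real machine. The function name `generation_time` is hypothetical too. It assumes the simple model that prefill time is prompt tokens divided by prefill tokens/s, and decode time is output tokens divided by decode tokens/s.

```python
def generation_time(prompt_tokens, output_tokens, prefill_tps, decode_tps):
    """Return (prefill_seconds, decode_seconds) for a single request.

    Simplified model: each phase runs at a constant tokens-per-second rate.
    """
    prefill_s = prompt_tokens / prefill_tps
    decode_s = output_tokens / decode_tps
    return prefill_s, decode_s


# Hypothetical throughput figures (assumptions, not measurements).
PREFILL_TPS = 200.0  # prompt tokens processed per second
DECODE_TPS = 20.0    # output tokens generated per second

# Short chat turn: small prompt, so decode dominates.
chat = generation_time(100, 400, PREFILL_TPS, DECODE_TPS)

# Long-context task (e.g. a large document in the prompt): prefill dominates.
long_ctx = generation_time(20_000, 400, PREFILL_TPS, DECODE_TPS)

print("chat:     prefill %.1fs, decode %.1fs" % chat)
print("long ctx: prefill %.1fs, decode %.1fs" % long_ctx)
```

		With the same decode speed in both cases, the long-prompt request under these assumed rates spends several times longer in prefill than in decode, which is why a benchmark that only lists decode tokens/s says little about non-chat workloads.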