	wmf 2 hours ago
		Arguably DRAM-based GPUs/TPUs are quite inefficient for inference compared to SRAM-based Groq/Cerebras. GPUs are highly optimized, but they still lose to different architectures that are better suited for inference.