Remix clone Hacker News

	▲	moralestapia 2 hours ago
		>HOW NVIDIA GPUs process stuff? (Inefficiency 101) Wow. Massively ignorant take. A modern GPU is an amazing feat of engineering, particularly in making computation more efficient (low power/high throughput). The article then proceeds to explain, wrongly, how inference is supposedly implemented and draws conclusions from there ...
	▲	wmf an hour ago
		Arguably DRAM-based GPUs/TPUs are quite inefficient for inference compared to SRAM-based Groq/Cerebras. GPUs are highly optimized but they still lose to different architectures that are better suited for inference.
	▲	beAroundHere 2 hours ago
		Hey, can you please point out and explain the inaccuracies in the article? I wrote this post to give a higher-level understanding of traditional vs. Taalas's inference, so it does abstract away a lot of things.