pama 4 hours ago

Parallel inference at large compute scales superlinearly. There is no way to match the reduction in memory transfers that a data-center inference deployment provides using hardware that fits in anything resembling a home. Processing huge batches of parallel requests is far more energy efficient than running one or a handful of queries on a single accelerator.
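A minimal back-of-the-envelope sketch of the amortization argument, under the assumption (not stated in the thread, but standard for memory-bound decoding) that each forward pass must stream all model weights from memory once, regardless of how many requests share that pass. The model size and batch sizes below are hypothetical:

```python
# Sketch: per-token weight traffic shrinks linearly with batch size,
# because one pass over the weights serves every request in the batch.

def weight_bytes_per_token(n_params: float, bytes_per_param: int, batch_size: int) -> float:
    """Weight bytes streamed from memory per generated token."""
    total_weight_bytes = n_params * bytes_per_param
    return total_weight_bytes / batch_size

# Hypothetical 70B-parameter model in fp16 (2 bytes per parameter).
single = weight_bytes_per_token(70e9, 2, batch_size=1)     # 140 GB streamed per token
batched = weight_bytes_per_token(70e9, 2, batch_size=256)  # ~0.55 GB streamed per token

print(f"batch=1:   {single / 1e9:.2f} GB/token")
print(f"batch=256: {batched / 1e9:.2f} GB/token")
```

At batch size 1, every token pays the full cost of streaming the weights; at batch size 256, that cost is split 256 ways, which is the energy-efficiency gap the comment is pointing at.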

dudefeliciano 3 hours ago | parent [-]

Aren't data centers extremely energy inefficient due to network latency, memory bottlenecks, and so on? The models that run on them are extremely powerful compared to what you can run on consumer hardware, but I wouldn't call them efficient...