Remix clone Hacker News

	▲	FrenchTouch42 2 hours ago
		> time-to-first-token/token-per-second/memory-used/total-time-of-test
		Wouldn't it help with the DDR4 example, though, if we had more "real world" tests?
	▲	bigyabai 2 hours ago
		Maybe, but even that four-part metric is missing key performance details like context length and model size/sparsity. The bigger takeaway (IMO) is that there will never really be hardware that scales the way Claude or ChatGPT do. I love local AI, but it stresses the fundamental limits of on-device compute.
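
For illustration only, here is a minimal sketch of how the timing metrics quoted above might be derived from per-token timestamps, with context length recorded next to them so runs stay comparable. The names `RunMetrics` and `summarize_run` are hypothetical and not taken from any benchmark tool; memory use and model size/sparsity are left out because they depend on the runtime being measured.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class RunMetrics:
    context_length: int         # prompt tokens fed to the model (recorded, not measured)
    time_to_first_token: float  # seconds from request start to the first generated token
    tokens_per_second: float    # decode throughput, counted after the first token
    total_time: float           # seconds from request start to the last generated token


def summarize_run(start: float, token_times: Sequence[float], context_length: int) -> RunMetrics:
    """Summarize one generation run from the arrival time of each generated token."""
    if not token_times:
        raise ValueError("need at least one generated token")
    ttft = token_times[0] - start
    total = token_times[-1] - start
    decode_time = token_times[-1] - token_times[0]
    # The first token is excluded from throughput because its latency is
    # dominated by prompt processing, which grows with context length.
    tps = (len(token_times) - 1) / decode_time if decode_time > 0 else 0.0
    return RunMetrics(context_length, ttft, tps, total)
```

Under these assumptions, two runs on the same machine with different context lengths would produce separate records rather than one blended number.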