williamdclt | 3 days ago
I don't really know what I'm talking about (whether about graphics cards or AI inference), but if someone figures out how to cut the compute needed for AI inference significantly, then I'd guess demand for graphics cards would suddenly drop? Given how young and volatile this domain still is, it doesn't seem unreasonable to be wary of it. Big players (Google, OpenAI and the like) are probably pouring tons of money into trying to do exactly that.
rtrgrd | 3 days ago
I would suspect that for self-hosted LLMs, quality >>> performance, so newer releases will always expand to fill the capacity of available hardware, even when efficiency improves.
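To make the "expand to fill capacity" intuition concrete, here's a rough back-of-envelope sketch in Python (all numbers are illustrative assumptions, not benchmarks): halving the bits per weight roughly doubles the model a self-hoster can fit on the same card, so efficiency gains tend to get spent on a bigger/better model rather than on fewer GPUs.

    # Back-of-envelope: largest model whose weights fit in a fixed VRAM
    # budget at a given quantization level. Illustrative assumptions only.

    def weights_vram_gb(params_billions: float, bits_per_weight: int,
                        overhead: float = 1.2) -> float:
        """Approximate VRAM (GB) for the weights alone, with a ~20% fudge
        factor for KV cache and runtime overhead (an assumed figure)."""
        bytes_total = params_billions * 1e9 * bits_per_weight / 8
        return bytes_total * overhead / 1e9

    vram_budget_gb = 24  # e.g. one consumer GPU (assumed budget)

    for bits in (16, 8, 4):
        # Largest model (billions of params) that fits at this precision
        max_params_b = vram_budget_gb / weights_vram_gb(1, bits)
        print(f"{bits:>2}-bit: ~{max_params_b:.0f}B params fit in {vram_budget_gb} GB")

Under these assumptions that prints roughly 10B at 16-bit, 20B at 8-bit, and 40B at 4-bit: the card stays maxed out either way, which is the Jevons-style point above.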