Remix clone Hacker News

	▲	mrob 6 hours ago
		LLM inference is mostly read-only, so high-bandwidth flash looks like it could provide huge cost savings over VRAM. It's not yet in commercial products, but there are already working prototypes. Previous HN discussion: https://news.ycombinator.com/item?id=46700384
	▲	whosegotit 4 hours ago
		[dead]