Remix clone Hacker News

	▲	adastra22 2 days ago
		Again, memory bandwidth is pretty much all that matters here. During inference or training, the CUDA cores of retail GPUs are only about 15% utilized.
	▲	my123 a day ago
		Not for prompt processing. Current Macs are really not great at long contexts.