Remix clone Hacker News

	▲	aurareturn 4 hours ago
		It’s just dumb to think that one chip per model is their plan. They stated that their plan is to chain multiple chips together. I was indeed wrong about 10 chips. I thought they would use Llama 8B at 16-bit with a few thousand tokens of context. It turns out they used Llama 8B at 3-bit with around 1k context. My 16-bit assumption is what made me think they must have chained multiple chips together, since the max SRAM on a reticle-sized TSMC N6 chip is only around 3GB.
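		For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch. It counts weights only, assumes exactly 8 billion parameters and takes the ~3GB SRAM figure above at face value. KV cache and activations are ignored, so real chip counts would be higher. The function names are made up for illustration.

```python
# Weights-only estimate: how many ~3 GB SRAM chips does an 8B model need?
# Assumptions (not from any vendor spec): exactly 8e9 parameters,
# 3e9 bytes of usable SRAM per chip, no KV cache or activation memory.

N_PARAMS = 8_000_000_000
SRAM_BYTES_PER_CHIP = 3_000_000_000


def weight_bytes(n_params: int, bits_per_weight: int) -> int:
    """Bytes needed to hold the weights, rounded up to a whole byte."""
    total_bits = n_params * bits_per_weight
    return -(-total_bits // 8)  # integer ceiling division


def chips_needed(n_params: int, bits_per_weight: int, sram_bytes: int) -> int:
    """Minimum number of chips whose combined SRAM fits the weights."""
    return -(-weight_bytes(n_params, bits_per_weight) // sram_bytes)


if __name__ == "__main__":
    for bits in (16, 3):
        gb = weight_bytes(N_PARAMS, bits) / 1e9
        chips = chips_needed(N_PARAMS, bits, SRAM_BYTES_PER_CHIP)
        print(f"{bits}-bit: {gb:.0f} GB of weights, at least {chips} chip(s)")
```

		At 16-bit the weights alone are 16 GB, which cannot fit in a single ~3GB chip. At 3-bit they come to about 3 GB, right at the limit of one chip.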