Remix clone Hacker News


	regularfry 2 days ago
		More tokens in the context means disproportionately more VRAM, to the extent that you really do need multiple GPUs if you're running an interestingly-sized model.
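
		To put rough numbers on that, here is a back-of-the-envelope sketch, not a measurement. It assumes a plain transformer with a per-layer key/value cache stored in fp16, and the model shape used (32 layers, 32 heads, head dimension 128) is purely illustrative, not taken from the comment. The function names are made up for this example. The KV cache on its own grows linearly with tokens; the "disproportionate" part shows up when attention scores are materialized naively, since that buffer grows with the square of the context length. Fused attention kernels avoid materializing it, so treat the second figure as a worst case.

```python
GIB = 1024 ** 3


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens,
                   bytes_per_elem=2, batch=1):
    """Approximate KV cache size: one K and one V tensor per layer."""
    return 2 * n_layers * n_kv_heads * head_dim * n_tokens * bytes_per_elem * batch


def naive_attn_scores_bytes(n_heads, n_tokens, bytes_per_elem=2, batch=1):
    """Transient size of a fully materialized (tokens x tokens) score
    matrix for ONE layer. Assumes no fused/flash attention kernel."""
    return n_heads * n_tokens * n_tokens * bytes_per_elem * batch


# Illustrative model shape (an assumption, not any specific model's spec).
LAYERS, HEADS, HEAD_DIM = 32, 32, 128

for tokens in (4096, 8192, 32768):
    kv = kv_cache_bytes(LAYERS, HEADS, HEAD_DIM, tokens)
    scores = naive_attn_scores_bytes(HEADS, tokens)
    print(f"{tokens:>6} tokens: KV cache {kv / GIB:6.1f} GiB, "
          f"naive scores per layer {scores / GIB:6.1f} GiB")
```

		Under these assumptions, doubling the context doubles the KV cache but quadruples the naive score buffer, and both sit on top of the model weights. That is the point at which a single card stops being enough.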