Remix clone Hacker News

	▲	cyanydeez 3 hours ago
		Not at the VRAM sizes that control how much context to load; also, GPUs aren't as efficient as direct inference.
	▲	wmf 25 minutes ago
		OK, B70.