Remix clone Hacker News

	▲	zozbot234 2 days ago
		Ollama or llama.cpp are also common alternatives. But an 8B model isn't going to have much real-world knowledge or be highly reliable for agentic workloads, so it makes sense that people will want more than that.
	▲	zach_vantio 2 days ago
		the compute density is insane. but giving a 70B model actual write access locally for agentic workloads is a massive liability. they still hallucinate too much. raw compute without strict state control is basically just a blast radius waiting to happen.