Remix clone Hacker News

	▲	drnick1 3 hours ago
		Do you recommend Ollama or bare llama.cpp?
	▲	jboss10 2 hours ago
		llama.cpp. It's faster and more fully open source. Ollama has a mixed history. I use llama-swap to emulate the Ollama experience.
	▲	shironnnn_ 2 hours ago
		If you're on macOS, I recommend llm-mlx, which currently generates tokens 10-15% faster than llama.cpp.