I got tired of repeating the same points and having to dig up sources every time, so here's the timeline (as I know it) in one place with sources.

brabel 6 hours ago | parent | next [-]

Thanks for writing this. I hope people here will actually read it and not assume it's some unfounded hit piece. I was involved a little bit in llama.cpp and knew most of what you wrote, and it’s just disgusting how the Ollama founders behaved! For people looking for alternatives, I would also recommend llamafile: it’s a single-file executable for any OS that includes your chosen model: https://github.com/mozilla-ai/llamafile?tab=readme-ov-file

It’s truly open source, backed by Mozilla, openly uses llama.cpp, and was created by wizard Justine Tunney of Cosmopolitan Libc fame.

	cachius 5 hours ago | parent [-]
		I also thought llamafile deserves a mention. Once you have all the model params and tunings done, it bakes 'em into a single portable binary!

julien_c 2 hours ago | parent | prev | next [-]

> Ollama eventually added ollama run hf.co/{repo}:{quant} to pull directly from Hugging Face, which partially addresses the availability problem.

uh actually, _we_ did (generates a Docker-style manifest on the fly)

Mario9382 5 hours ago | parent | prev | next [-]

Really nice. I wasn't aware of any of this.

kelsolaar 6 hours ago | parent | prev | next [-]

Great writing, thanks for the summary and timeline.

robot-wrangler 6 hours ago | parent | prev [-]

Thanks, did not know any of this.