drnick1 3 hours ago

Do you recommend Ollama or bare llama.cpp?

jboss10 2 hours ago

llama.cpp. It's faster and more open source. Ollama has a mixed history. I use llama-swap to emulate the Ollama experience.

shironnnn_ 2 hours ago

If you're on macOS, I recommend llm-mlx, which currently generates tokens 10-15% faster than llama.cpp.