drnick1 3 hours ago
Do you recommend Ollama or bare llama.cpp?
jboss10 2 hours ago
llama.cpp. It's faster and more open source. Ollama has a somewhat mixed history. I use llama-swap to emulate the Ollama experience.
shironnnn_ 2 hours ago
If you're on macOS, I recommend llm-mlx, which currently generates tokens 10-15% faster than llama.cpp.