| ▲ | brcmthrowaway 11 hours ago |
| What is the difference between Ollama, llama.cpp, ggml and gguf? |
|
| ▲ | benob 11 hours ago | parent | next [-] |
| Ollama is a user-friendly UI for LLM inference. It is powered by llama.cpp (or a fork of it) which is more power-user oriented and requires command-line wrangling. GGML is the math library behind llama.cpp and GGUF is the associated file format used for storing LLM weights. |
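To make the GGUF part concrete: per the format spec in the ggml repository, every GGUF file begins with the ASCII magic "GGUF", followed by a uint32 version and (in recent versions) uint64 tensor and metadata-entry counts, all little-endian. A minimal sketch of parsing that fixed header, using a synthetic byte blob rather than a real model file:

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file

def read_gguf_header(data: bytes):
    """Parse the fixed GGUF header: magic, version, tensor count, metadata KV count."""
    if data[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    # little-endian: uint32 version, uint64 tensor_count, uint64 metadata_kv_count
    version, n_tensors, n_kv = struct.unpack_from("<IQQ", data, 4)
    return version, n_tensors, n_kv

# Build a tiny synthetic header (version 3, 2 tensors, 5 metadata entries) and parse it.
blob = GGUF_MAGIC + struct.pack("<IQQ", 3, 2, 5)
print(read_gguf_header(blob))  # (3, 2, 5)
```

After the header come the metadata key/value pairs (architecture, tokenizer, quantization info) and the tensor descriptors; this sketch stops at the header.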
| |
| ▲ | redmalang 10 hours ago | parent [-] | | I've found llama.cpp (as I understand it, Ollama now uses its own fork of it) to work much better in practice: faster and much more flexible. |
|
|
| ▲ | xiconfjs 11 hours ago | parent | prev [-] |
| Ollama on macOS is a one-click solution with stable one-click updates. Happy so far; MLX support was the only missing piece for me. |
| |
| ▲ | yard2010 9 hours ago | parent [-] | | Can you please write about your hardware? | | |
| ▲ | xiconfjs 2 hours ago | parent [-] | | * macOS 26.x on MacBook Pro M1 Max, 32 GB
* Ollama on macOS, Cursor to play around
* Open WebUI [1] on my home server, talking to Ollama via its API (also for remote "AI" access)
* running gpt-oss:20b and qwen3.5:9b with ease, qwen3.5:27b for more complex tasks [1] https://github.com/open-webui/open-webui | | |
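The "via API to Ollama" piece above is just HTTP: Ollama serves a REST API on localhost port 11434 by default, and Open WebUI (or anything else) talks to it there. A minimal sketch of a non-streaming generate call, assuming a local `ollama serve` with the model already pulled:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(model: str, prompt: str) -> bytes:
    """Encode a non-streaming generate request body for Ollama's REST API."""
    return json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()

def generate(model: str, prompt: str) -> str:
    """POST the prompt to a locally running Ollama and return the response text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_request(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Requires a running Ollama with the model pulled, e.g.:
#   print(generate("gpt-oss:20b", "Say hello in one word."))
```

Exposing that same port to a home server (as above) is what lets Open WebUI act as the remote front end.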
| ▲ | brcmthrowaway an hour ago | parent [-] | | Seems complicated. Switch to LMStudio | | |
| ▲ | xiconfjs 25 minutes ago | parent [-] | | I tried many times, but at least with its API active, LMStudio has some kind of memory leak that slows down the whole system (after ~1-2 days of uptime), even after unloading the model and quitting LMStudio, to the point where playing a 1080p video drops frames. No such issues with Ollama. |
|
|
|
|