sleepybrett 4 hours ago

or you can just load up ollama, have it load a local model and point claude or opencode at it...

is this article old? It's not. I'm not sure why he went through all the bother of llama.cpp

malkosta 4 hours ago

I had exactly the same question. Then I finished reading the post. The reason is stated clearly there: it is faster than ollama+mlx.

sleepybrett 4 hours ago

how much faster?