Why are you using Ollama? Just use llama.cpp
brew install llama.cpp
Use the built-in CLI, server, or chat interface, and hook it up to any other app.
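A rough sketch of what that looks like (the model path here is a placeholder, and flags assume a recent llama.cpp build):

```shell
# Install llama.cpp via Homebrew
brew install llama.cpp

# One-shot generation with the bundled CLI (model path is a placeholder)
llama-cli -m ./models/your-model.gguf -p "Hello"

# Or run the built-in server, which exposes an OpenAI-compatible
# HTTP API that other apps can point at
llama-server -m ./models/your-model.gguf --port 8080
```

Anything that speaks the OpenAI chat-completions API can then be pointed at `http://localhost:8080`.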
For MLX, I'd guess.
That's also discussed upstream in llama.cpp: https://github.com/ggml-org/llama.cpp/discussions/4345
https://omlx.ai/