It's not implemented in mlx-lm[1] or llama.cpp[2] yet, so it may take a while.
[1] https://github.com/ml-explore/mlx-lm/pull/990
[2] https://github.com/ggml-org/llama.cpp/pull/22673