| ▲ | whackernews 9 hours ago |
| Oh does llama.cpp use MLX or whatever? I had this question, wonder if you know? A search suggests it doesn’t but I don’t really understand. |
|
| ▲ | irusensei 8 hours ago | parent | next [-] |
| > Oh does llama.cpp use MLX or whatever? No. It runs on macOS but uses Metal instead of MLX. |
| |
| ▲ | zozbot234 8 hours ago | parent | next [-] |
| ANE-powered inference (at least for prefill, which is a key bottleneck on pre-M5 platforms) is also in the works, per https://github.com/ggml-org/llama.cpp/issues/10453#issuecomm... |
| ▲ | OkGoDoIt 8 hours ago | parent | prev [-] |
| Is that better or worse? |
| ▲ | irusensei 6 hours ago | parent [-] |
| Depends. MLX is faster because it has better integration with Apple hardware. On the other hand, GGUF is a far more popular format, so there will be more programs and model variety. So it's kinda like having a very specific diet that you swear is better for you, but you can only order food from a few restaurants. |
| ▲ | drob518 5 hours ago | parent [-] |
| But you can always fall back to GGUF while waiting for the world to build a few more MLX restaurants. Or something like that; the analogy is a bit stretched. |
|
| ▲ | LoganDark 8 hours ago | parent | prev [-] |
| llama.cpp uses GGML, which uses Metal directly. |