aurareturn 3 hours ago

Don't buy the Mini or Studio. Both have the M4, which lacks the Neural Accelerators, making prompt processing ~3-4x slower than on the M5.

mortenjorck 3 hours ago | parent [-]

I assume those don't just work automatically with an off-the-shelf GGUF. What do you need in your local inference stack to take advantage of the M5's Neural Accelerators?

aurareturn 3 hours ago | parent [-]

They do work with llama.cpp and MLX automatically.