mfa1999 11 hours ago

How does this compare to llama.cpp in terms of performance?

solarkraft 10 hours ago

MLX is a bit faster (by a low double-digit percentage), but uses a bit more RAM. A worthwhile tradeoff for many.

ysleepy 8 hours ago

On my M4 Pro, MLX gets almost 2x the tok/s.