MLX LM 0.20.1 has comparable speed to llama.cpp with flash attention (old.reddit.com)
1 point by tosh 8 hours ago | 1 comment
7 hours ago
[deleted]