Remix.run Logo
Rohansi 13 hours ago

This, and always check benchmarks instead of assuming memory bandwidth is the only possible bottleneck. Apple Silicon definitely does not fully use its advertised memory bandwidth when running LLMs.