adastra22 2 days ago

Again, memory bandwidth is pretty much all that matters here. During inference or training, the CUDA cores of retail GPUs are only around 15% utilized.

my123 a day ago | parent | next [-]

Not for prompt processing. Current Macs are really not great at long contexts.
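
A rough back-of-envelope sketch of why both points can hold at once. It assumes a dense model where each forward pass reads every weight once and costs about 2 FLOPs per parameter per token, and it ignores attention and KV-cache traffic. The hardware numbers and all function names are made up for illustration; they are not the specs of any real GPU or Mac.

```python
def forward_pass_times(params, bytes_per_param, tokens, bandwidth, flops):
    """Return (memory_seconds, compute_seconds) for one forward pass.

    Assumption: weights are streamed from memory once per pass, and the
    pass costs roughly 2 FLOPs per parameter per token processed.
    """
    memory_s = params * bytes_per_param / bandwidth
    compute_s = 2 * params * tokens / flops
    return memory_s, compute_s


def bottleneck(params, bytes_per_param, tokens, bandwidth, flops):
    """Name whichever of the two estimates dominates the pass."""
    memory_s, compute_s = forward_pass_times(
        params, bytes_per_param, tokens, bandwidth, flops
    )
    return "memory" if memory_s >= compute_s else "compute"


def crossover_tokens(bytes_per_param, bandwidth, flops):
    """Tokens per pass at which compute time equals weight-read time."""
    return bytes_per_param * flops / (2 * bandwidth)


# Hypothetical setup: 7B parameters in fp16 on a device with
# 400 GB/s of memory bandwidth and 20 TFLOP/s of usable compute.
PARAMS = 7e9
BYTES_PER_PARAM = 2
BANDWIDTH = 400e9
FLOPS = 20e12

# Generating one token at a time: a single token per pass.
print(bottleneck(PARAMS, BYTES_PER_PARAM, 1, BANDWIDTH, FLOPS))
# Prompt processing: thousands of tokens share one read of the weights.
print(bottleneck(PARAMS, BYTES_PER_PARAM, 4096, BANDWIDTH, FLOPS))
print(crossover_tokens(BYTES_PER_PARAM, BANDWIDTH, FLOPS))
```

Under these assumptions, token-by-token generation is limited by how fast the weights can be read, while a long prompt amortizes that read over many tokens and becomes limited by raw compute instead.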
