adastra22 2 days ago
Again, memory bandwidth is pretty much all that matters here. During inference or training, the CUDA cores of retail GPUs are like 15% utilized.
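A rough back-of-the-envelope for why single-stream decode ends up memory-bound (all numbers here are assumed for illustration, not from the thread): each generated token has to stream the full weight set from VRAM but only needs ~2 FLOPs per parameter, so the compute units mostly wait on memory.

```python
# Hypothetical roofline sketch: token-by-token decode of an assumed
# 7B-parameter fp16 model on an assumed 1 TB/s, 300 TFLOP/s GPU.
params = 7e9                 # assumed model size
bytes_per_param = 2          # fp16 weights
flops_per_token = 2 * params # ~2 FLOPs per parameter per token

mem_bw = 1.0e12              # assumed memory bandwidth, bytes/s
peak_flops = 300e12          # assumed peak fp16 throughput, FLOP/s

t_mem = params * bytes_per_param / mem_bw  # time to stream all weights once
t_compute = flops_per_token / peak_flops   # time to do the matmul math

utilization = t_compute / max(t_mem, t_compute)
print(f"weight streaming: {t_mem*1e3:.1f} ms/token, "
      f"compute: {t_compute*1e3:.3f} ms/token")
print(f"compute utilization ≈ {utilization:.1%}")
```

With batching, the same weight read is amortized over many tokens, which is how utilization climbs back toward the kind of double-digit figures mentioned above.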
my123 a day ago
Not for prompt processing. Current Macs are really not great at long contexts.
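The asymmetry is easy to see with a hedged sketch (all figures assumed): prefill pushes every prompt token through the model at once, so arithmetic intensity is high and raw FLOP/s dominates rather than bandwidth, which is where a weaker GPU hurts at long contexts.

```python
# Hypothetical prefill-time comparison: an assumed 7B fp16 model over an
# assumed 32k-token prompt, ignoring the O(n^2) attention term (which
# only widens the gap at long contexts).
params = 7e9
prompt_tokens = 32_000
prefill_flops = 2 * params * prompt_tokens  # ~2 FLOPs/param/token

mac_flops = 30e12    # assumed Apple-Silicon-class GPU, FLOP/s
dgpu_flops = 300e12  # assumed discrete GPU, FLOP/s

print(f"assumed Mac prefill:  {prefill_flops / mac_flops:.1f} s")
print(f"assumed dGPU prefill: {prefill_flops / dgpu_flops:.1f} s")
```

Same model, same prompt: the compute-bound phase scales directly with peak FLOP/s, so the bandwidth advantage that helps decode does nothing for prefill.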
2 days ago
[deleted]