| ▲ | rib3ye 9 hours ago | |
How many tokens/sec? | ||
| ▲ | roadside_picnic 8 hours ago | parent [-] | |
M3 Max laptop: ~55 tokens/sec. RTX 4090: ~190 tokens/sec. I don't have the number on hand, but there is notable pre-fill latency on the M3; once it's running, the delay is negligible. The RTX, unsurprisingly, is superior all around performance-wise, but I use that computer for gaming and image gen work so I can't dedicate it as a server, and, especially when it's warmer, the heat generated under heavy loads is noticeable. | ||