jmyeet 4 hours ago
I didn't see this in the article, but elsewhere I've seen the memory bandwidth quoted as 600GB/s [1]. For comparison:

- 5090/6000 Pro: 1792GB/s
- 5080: 960GB/s
- 5070Ti: 896GB/s
- M3 Ultra: 819GB/s
- DGX Spark: 273GB/s (less than an M5 Pro at 307GB/s)

Memory bandwidth isn't everything, but it will cap inference rate pretty heavily.

Also, the M3 Ultra is in an almost 2-year-old Mac Studio. It's widely expected that it'll be refreshed in Q3 with a likely M5 or M4 Ultra with >1000GB/s. I really hope Apple realizes what a market opportunity it has here.

The above shows just how good value the 5090 really is. It's basically an RTX 6000 Pro, a ~$10k card, with less RAM (and ~10% fewer CUDA cores), for 20-30% of the price. This also demonstrates how NVidia uses VRAM for market segmentation.

As an aside, the true data center cards (e.g. B100, H100) use HBM memory at ~3.2TB/s.

[1]: https://wccftech.com/nvidia-enters-pc-space-with-rtx-spark/
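To put rough numbers on the "bandwidth caps inference rate" point, here is a back-of-the-envelope sketch. It assumes single-stream decoding of a dense model where every generated token has to read all the weights from memory once, so tokens/s can't exceed bandwidth divided by model size. The 16 GB model size is a made-up example, and the function name is just for illustration; real throughput will be lower (KV cache reads, compute, overhead).

```python
# Rough decode-speed ceiling, assuming each generated token streams all
# model weights from memory once: tokens/s <= bandwidth / model size.
# This is an upper bound only; it ignores KV cache traffic and compute.

def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Bandwidth-bound ceiling on decode rate for a dense model."""
    return bandwidth_gb_s / model_size_gb

# Bandwidth figures taken from the comment above (GB/s).
BANDWIDTH_GB_S = {
    "5090 / 6000 Pro": 1792,
    "M3 Ultra": 819,
    "quoted figure [1]": 600,
    "DGX Spark": 273,
}

# Assumption: a hypothetical dense model whose weights occupy 16 GB.
MODEL_SIZE_GB = 16.0

for name, bw in BANDWIDTH_GB_S.items():
    ceiling = max_tokens_per_second(bw, MODEL_SIZE_GB)
    print(f"{name:>18}: <= {ceiling:6.1f} tok/s")
```

Under these assumptions the ranking of the ceilings is just the ranking of the bandwidths, which is why the raw GB/s number is a decent first-order proxy.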
wmf 3 hours ago
Spark memory bandwidth is ~300 GB/s. Internal bandwidth is 600 GB/s but that doesn't matter.
dist-epoch 3 hours ago
128 GB at 600 GB/s for this versus 32 GB at 1800 GB/s for the 5090. This is much better value than the 5090, because you can run much bigger models.
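A quick sketch of that capacity-versus-bandwidth trade-off, under simplifying assumptions: the model either fits entirely in device memory or it doesn't run at all (no offloading), and when it fits the decode rate is capped at bandwidth divided by model size. The 16 GB and 70 GB model sizes are hypothetical examples, and KV cache and runtime overhead are ignored.

```python
from typing import Optional

def decode_ceiling(mem_gb: float, bw_gb_s: float, model_gb: float) -> Optional[float]:
    """Bandwidth-bound tokens/s ceiling, or None if the weights don't fit.

    Assumes a dense model read once per token and no offloading.
    """
    if model_gb > mem_gb:
        return None
    return bw_gb_s / model_gb

# (memory in GB, bandwidth in GB/s), using the figures from the comment above.
DEVICES = {
    "this device (quoted)": (128, 600),
    "5090": (32, 1800),
}

# Assumption: two hypothetical model footprints, one small and one large.
for model_gb in (16, 70):
    for name, (mem_gb, bw_gb_s) in DEVICES.items():
        ceiling = decode_ceiling(mem_gb, bw_gb_s, model_gb)
        if ceiling is None:
            print(f"{model_gb:>3} GB model on {name}: does not fit")
        else:
            print(f"{model_gb:>3} GB model on {name}: <= {ceiling:.1f} tok/s")
```

With these numbers the 5090 wins on anything that fits in 32 GB and can't run the larger model at all, so which one is "better value" depends on the model sizes you care about.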
MrBuddyCasino 4 hours ago
Yeah, and also the quoted 1 PF is only for sparse models (only half that for dense, if that), and the DGX had serious hardware issues: https://x.com/ID_AA_Carmack/status/1982831774850748825