slickytail | 7 days ago
The memory bandwidth on an H100 is ~3 TB/s, for reference. That number is the limiting factor on the size of modern LLMs. 100 GB/s isn't even in the realm of viability.
torginus | 7 days ago | parent
That bandwidth is for the whole GPU, which has 6 memory chips. But anyway, what I'm proposing isn't for the high end or for training, but for making inference cheap. And I was somewhat conservative with the numbers: a modern budget SSD with a single NAND chip can do more than 5 GB/s of read speed.
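A quick back-of-the-envelope sketch of why bandwidth caps decode speed: in memory-bound autoregressive decoding, every weight is read once per generated token, so tokens/sec is roughly bandwidth divided by model size. The model size and bandwidth figures below are illustrative assumptions, not measurements.

```python
# Rough upper bound on decode rate when weight reads dominate:
# tokens/sec ~= memory bandwidth / model size (weights read once per token).

def tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Bandwidth-bound ceiling on tokens generated per second."""
    return bandwidth_gb_s / model_size_gb

# Assumed example: a 70B-parameter model in fp16 (2 bytes/param) ~= 140 GB.
model_gb = 140.0

hbm = tokens_per_sec(3000.0, model_gb)  # H100-class HBM, ~3 TB/s
ssd = tokens_per_sec(100.0, model_gb)   # hypothetical 100 GB/s NAND setup

print(f"HBM-bound:  {hbm:.1f} tok/s")   # ~21.4 tok/s
print(f"NAND-bound: {ssd:.2f} tok/s")   # ~0.71 tok/s
```

Under these assumptions the gap is about 30x, which is the crux of the disagreement: usable for cheap, slow inference, but nowhere near HBM-class serving.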