Remix.run Logo
reilly3000 5 days ago

Presumably this is about adding more memory channels via pcie lanes. I’m very curious to know what kind of bandwidth one could expect with such a setup, as that is the primary bottleneck for inference speed.

Dylan16807 5 days ago | parent [-]

The raw speed of PCIe 5.0 x16 is 63 billion bytes per second each way. Assuming we transfer several cache lines at a time the overhead should be pretty small, so expect 50-60GB/s. Which is on par with a single high-clocked channel of DRAM.