lmeyerov 2 hours ago

This is exactly what I was wondering

I gave a talk a few years ago at the Dask Summit (conf?) on making the stars align with dask-cudf here. We were helping a customer accelerate log analytics by proving out our stack on nodes that looked roughly like: parallel SSD storage arrays (30 x 3 GB/s?) -> GPUDirect Storage -> 4 x 30 GB/s PCIe (?) -> 8 x A100 GPUs, something like that. It'd be cool to see the same thing now in the LLM world, such as a multi-GPU MoE, or even a single-GPU one for that matter!
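
For context, a minimal sketch of what that kind of dask-cudf log-analytics pipeline looks like in code: one Dask worker per GPU, reading logs from the fast storage array straight into GPU dataframes. The paths, column names, and the GDS environment knob here are assumptions for illustration, not the customer's actual setup.

```python
import os

# cuDF exposes a cuFile/GPUDirect Storage policy via this environment
# variable in recent releases; skip it if your build or node lacks GDS.
os.environ["LIBCUDF_CUFILE_POLICY"] = "GDS"

from dask_cuda import LocalCUDACluster
from dask.distributed import Client
import dask_cudf

if __name__ == "__main__":
    # One worker per visible GPU (e.g. 8 x A100 on a single node).
    cluster = LocalCUDACluster()
    client = Client(cluster)

    # Hypothetical layout: Parquet log files sitting on the parallel SSD array.
    logs = dask_cudf.read_parquet("/mnt/nvme/logs/*.parquet")

    # A typical log-analytics aggregation: error counts per service.
    errors = logs[logs["level"] == "ERROR"]
    summary = errors.groupby("service").size().compute()
    print(summary.head())
```

The point of the exercise was that once the storage, GDS, and PCIe lanes line up, the dataframe code itself stays this boring: the speedup comes from the I/O path, not from rewriting the analytics.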