anonymousDan 2 days ago

Can someone tell me whether the challenges the article describes, and indeed the frameworks it mentions, are mostly relevant for training or also for inference?

benreesman 2 days ago

The fast interconnect between nodes has applications in inference at scale (big KV caches and other semi-durable state, multi-node tensor parallelism on mega models).

But this article in particular is emphasizing extreme performance ambitions for columnar data processing with hardware acceleration. That is relevant to many ML training scenarios, but also to other kinds of massive MapReduce-style (or at least MapReduce-scale) workloads. There are lots of applications of a "magic massive petabyte-plus DataFrame" (which is not, I think, solved in the general case).