froh an hour ago

> GPUs are extremely underutilized if you launch just 1 generation stream

why is that? b/c the GPU is idling while it waits for the hoooman? or are there steps that could be interleaved in parallel?

I have no intuition yet how this works under the hood.