rbanffy 3 hours ago
Can’t you make bandwidth reservations and optimise data placement so that communication prefers directly connected nodes over one- or two-hop paths?
KeplerBoy 3 hours ago
Sure, one could think of some kind of pipeline parallelism where each stage only needs a fast transfer to the next step in the model. That would boost throughput, but it would not increase the model size.
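To make the pipeline idea concrete, here is a rough toy sketch. It assumes a forward-only pipeline in which every stage takes one time step per microbatch; the function names and the 4-stage, 8-microbatch numbers are made up for illustration. It shows that every hand-off goes to the adjacent stage only, and how overlapping microbatches cuts the number of time steps.

```python
def pipeline_schedule(num_stages, num_microbatches):
    """Map each time step to the (stage, microbatch) pairs active in it.

    Toy model: stage s works on microbatch m at time step s + m.
    """
    schedule = {}
    for m in range(num_microbatches):
        for s in range(num_stages):
            schedule.setdefault(s + m, []).append((s, m))
    return schedule


def transfers(num_stages, num_microbatches):
    """List every activation hand-off as (src_stage, dst_stage, microbatch)."""
    return [
        (s, s + 1, m)
        for m in range(num_microbatches)
        for s in range(num_stages - 1)
    ]


# Hypothetical sizes, chosen only for illustration.
stages, microbatches = 4, 8

pipelined_steps = len(pipeline_schedule(stages, microbatches))
# Without overlap, only one stage is busy at a time.
sequential_steps = stages * microbatches

print(f"pipelined:  {pipelined_steps} steps")
print(f"sequential: {sequential_steps} steps")
print("all hand-offs are single-hop:",
      all(dst - src == 1 for src, dst, _ in transfers(stages, microbatches)))
```

In this toy model the step count drops from `stages * microbatches` to `stages + microbatches - 1`, and no stage ever needs a link to anything but its direct neighbour.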