| ▲ | DesaiAshu 3 days ago | |||||||
Data bandwidth limits distributed training under current architectures. Really interesting implications if we can make progress on that. | ||||||||
| ▲ | dogcomplex 2 days ago | parent | next [-] | |||||||
Limits but doesn't prohibit. See https://www.primeintellect.ai/blog/intellect-3 - still useful and can scale enormously. Takes a particular shape and relies heavily on RL, but still big. | ||||||||
| ▲ | andoando 2 days ago | parent | prev [-] | |||||||
What bandwidth limits? I'm assuming the forward and backward passes have to be done sequentially? | ||||||||