greenavocado 7 hours ago
So distribute copies of the model in RAM to multiple machines, have each machine update a different part of the model weights, and sync the updates over the network.
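A toy sketch of that idea, just to make it concrete: every worker keeps a full copy of the weights, computes a gradient step only for the slice it owns, and then all workers exchange their slices. Everything here is an assumption for illustration. The function names (`partition`, `local_update`, `sync`, `train`) are hypothetical, the loss is a simple per-weight quadratic, and the "network" is an in-process loop rather than real communication.

```python
def partition(n_params, n_workers):
    """Split parameter indices into contiguous, near-equal slices, one per worker."""
    base, extra = divmod(n_params, n_workers)
    slices, start = [], 0
    for w in range(n_workers):
        size = base + (1 if w < extra else 0)
        slices.append(range(start, start + size))
        start += size
    return slices


def local_update(weights, target, owned, lr):
    """One gradient step on loss = sum((w - t)^2), computed only for owned indices."""
    return {i: weights[i] - lr * 2.0 * (weights[i] - target[i]) for i in owned}


def sync(replicas, updates):
    """Stand-in for the network sync: apply every worker's slice to every replica."""
    for replica in replicas:
        for update in updates:
            for i, value in update.items():
                replica[i] = value


def train(n_params=8, n_workers=3, steps=50, lr=0.1):
    # Hypothetical toy problem: drive each weight toward a fixed target value.
    target = [float(i) for i in range(n_params)]
    # Each worker holds its own full copy of the model in memory.
    replicas = [[0.0] * n_params for _ in range(n_workers)]
    owned = partition(n_params, n_workers)
    for _ in range(steps):
        # Each worker updates only its own slice, using its local copy.
        updates = [
            local_update(replicas[w], target, owned[w], lr)
            for w in range(n_workers)
        ]
        # All workers exchange slices so the copies stay identical.
        sync(replicas, updates)
    return replicas, target
```

Because this toy loss is separable per weight, updating disjoint slices loses nothing compared with a full gradient step. For a real model the slices interact, so how often to sync and how stale each copy is allowed to get are the hard parts, and this sketch says nothing about them.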
olliepro 4 hours ago
decentralized training makes a lot more sense when the required hardware isn't a $40K GPU...