kelnos 3 hours ago
Is the issue that training with less compute takes more time? Or is it just not possible? I think a collective using distributed training could tolerate the idea that it takes 10x as long as Anthropic to train a model, or whatever.