delis-thumbs-7e 3 hours ago
Wouldn’t that be extremely computationally expensive, considering how resource-intensive training is?
colechristensen 3 hours ago
No. Training a state-of-the-art model involves on the order of 10 trillion tokens. We're talking about a step that updates the weights based on somewhere between 10k and 1M tokens.
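To put the scale difference in numbers, here is a rough back-of-the-envelope sketch. It assumes the round figures quoted above (about 10 trillion tokens for a full run, 10k to 1M tokens per update) and that cost per token is comparable in both cases, which is only an approximation. The names `PRETRAIN_TOKENS` and `update_fraction` are made up for illustration.

```python
# Rough orders of magnitude taken from the comment above, not measured values.
PRETRAIN_TOKENS = 10 * 10**12  # ~10 trillion tokens for a full training run


def update_fraction(update_tokens: int, pretrain_tokens: int = PRETRAIN_TOKENS) -> float:
    """Return the update's token count as a fraction of the full run's token count."""
    return update_tokens / pretrain_tokens


# Compare the low and high ends of the quoted update size against a full run.
for n in (10_000, 1_000_000):
    print(f"{n:>9,} tokens -> {update_fraction(n):.0e} of a full run")
```

Under those assumptions, a single update step touches somewhere between one billionth and one ten-millionth as many tokens as the full training run.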