pbd 7 days ago
i agree. this is very gray imo. e.g., books in India have cheap EEE editions compared to the ones in the US/Europe. so they could pre-process the data in India & then compile it in the US. would that get them around piracy rules & reduce cost as well?
BoorishBears 7 days ago
I mean, relative to the cost of pre-training, books are going to be cheap even if you buy them in the US (as demonstrated by the fact that Anthropic bought them afterwards). For post-training, other data sources (like human feedback and/or examples) are way more expensive than books.