robocat 2 hours ago
The AI training data sets are also expensive... The cost is hard to estimate for data sets that are internal to businesses like Google, especially if the model needs to be refreshed to deal with recent data. I presume historical internal data sets remain high value, since they might be cleaner (no slop) or no longer available elsewhere (copyright takedowns), and companies are getting better at hiding their data from spidering.