| ▲ | julianlam 5 hours ago | |
> Considering they trained their model on open-source software, the least they could do is give it to open-source maintainers for free with no time limit. Why? The resulting code generated by Claude is unfit for training, so any work product produced after the start of the subsidized program should be ignored. Therefore it makes sense to charge them for the service after 6 months, no? Heh. | ||
| ▲ | lambda 4 hours ago | parent [-] | |
What do you mean it's unfit for training? It's a form of reinforcement learning; the end result has been selected based on whether it actually solved the need. You need to be careful of the amount of reinforcement learning vs continued pretraining you do, but they already do plenty of other forms of reinforcement learning, I'm sure they have it dialed in. | ||