| ▲ | xdavidliu 11 hours ago | |||||||
Open source code is a minuscule fraction of the training data | ||||||||
| ▲ | TheCraiggers 10 hours ago | parent | next [-] | |||||||
I'd love to see a citation for that. We already know from a few years ago that AI models were being trained on projects hosted on GitHub. Meanwhile, I highly doubt software firms were lining up to have their proprietary codebases ingested for AI training. Even with NDAs, we would have heard something about it. | ||||||||
| ▲ | maplethorpe 11 hours ago | parent | prev [-] | |||||||
Where did most of the code in their training data come from? | ||||||||