bryanrasmussen 7 hours ago
Maybe there should be an LLM trained on a corpus of code deletions and cleanups.
ashdksnndck 44 minutes ago
I think this is in the training data since they use commit data from repos, but I imagine code deletions are rarer than they should be in the real data as well.
krackers 5 hours ago
I'm guessing there's a very strong prior toward "just keep generating more tokens", as opposed to deleting code, that needs to be overcome. Maybe this is done already, but since every git project comes with its own history, you could take a notable open-source project (like LLVM) and then do RL training against each individual patch committed.
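As a rough sketch of the data-mining half of that idea (not the RL part), something like the following could pull the net-deletion commits out of a repo's history. It assumes the input is the output of `git log --numstat --format="commit %H"`; the names `CommitStat`, `parse_numstat`, `net_deletion_commits`, and `read_git_log` are made up for illustration, and "net lines removed" is only a crude proxy for a cleanup commit.

```python
import subprocess
from dataclasses import dataclass


@dataclass
class CommitStat:
    """Line counts for one commit (hypothetical helper type)."""
    sha: str
    added: int = 0
    deleted: int = 0

    @property
    def net(self) -> int:
        # Negative means the commit removed more lines than it added.
        return self.added - self.deleted


def parse_numstat(log_text: str) -> list[CommitStat]:
    """Parse `git log --numstat --format="commit %H"` output."""
    commits: list[CommitStat] = []
    for line in log_text.splitlines():
        if line.startswith("commit "):
            commits.append(CommitStat(sha=line.split()[1]))
            continue
        parts = line.split("\t")
        if len(parts) != 3 or not commits:
            continue  # blank line or anything that is not a numstat row
        added, deleted, _path = parts
        if not (added.isdigit() and deleted.isdigit()):
            continue  # binary files are reported as "-" and are skipped
        commits[-1].added += int(added)
        commits[-1].deleted += int(deleted)
    return commits


def net_deletion_commits(log_text: str) -> list[CommitStat]:
    """Keep only commits whose net line change is negative."""
    return [c for c in parse_numstat(log_text) if c.net < 0]


def read_git_log(repo_path: str) -> str:
    """Run git in repo_path and return the numstat log as text."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--numstat", "--format=commit %H"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `net_deletion_commits(read_git_log("path/to/repo"))` would give the candidate commits. Whether those make good training targets is a separate question, since a net-negative diff can just as easily be a vendored dependency getting dropped.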