| ▲ | jakozaur 2 hours ago | |
Is it just me, or do LLM code assistants do catastrophically silly things (drop a DB, delete files, wipe a disk, etc.) far more often than humans? It looks like the training data has plenty of those examples, but the models don’t have enough grounding or warnings before doing them. I wish there were a PleaseDontDoAnythingStupidEval for software engineering. | ||