59nadir 4 hours ago
We've seen public examples of LLMs literally disabling or removing tests in order to get a passing run. Having tests and telling the LLM not to merge anything until they pass may be "easy", but I'm not sure that matters much when the failure modes here are so plentiful and broad in nature.
ElFitz 2 hours ago
My favourite so far was Claude "fixing" deployment checks with `continue-on-error: true`.
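For anyone who wants a cheap guard against that particular trick, here is a minimal sketch, assuming GitHub Actions style workflow files where `continue-on-error: true` lets a failing step or job count as passing. The function name `find_suppressed_failures`, the sample workflow, and the `./scripts/check_deploy.sh` path are all hypothetical, and this is a text-level heuristic rather than a real YAML parse, so treat it as a starting point only.

```python
import re

# Heuristic (assumption): match a line whose only content is the key
# `continue-on-error` set to true, optionally followed by a comment.
PATTERN = re.compile(
    r"^\s*(?:-\s*)?continue-on-error\s*:\s*true\s*(?:#.*)?$",
    re.IGNORECASE,
)


def find_suppressed_failures(workflow_text: str) -> list[int]:
    """Return 1-based line numbers where continue-on-error is set to true."""
    return [
        lineno
        for lineno, line in enumerate(workflow_text.splitlines(), start=1)
        if PATTERN.match(line)
    ]


# Hypothetical workflow illustrating the "fix" described above.
SAMPLE = """\
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Deployment check
        run: ./scripts/check_deploy.sh
        continue-on-error: true
"""

if __name__ == "__main__":
    for lineno in find_suppressed_failures(SAMPLE):
        print(f"line {lineno}: failure is being ignored")
```

Run against the diff of a proposed change rather than the whole repository, a check like this would at least force a human to look whenever the flag is newly introduced. It does nothing about the other failure modes, such as deleted or skipped tests.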