kristjansson 9 hours ago
> to make sure

You've really got to be careful with absolute language like this in reference to LLMs. A review agent provides no guarantees whatsoever; it just shifts the distribution of acceptable responses, hopefully in a direction the user prefers.
jawiggins 9 hours ago
Fair, it's something like a semantic enforcement rather than a hard one. I think current AI agents are good enough that if you tell one, "Review this PR and request changes any time a variable name is a color", it will do a pretty good job. But for complex things I can still see them falling short.
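For contrast, here's a minimal sketch of what a "hard" (deterministic) version of that same rule could look like, as opposed to asking an LLM to enforce it semantically. The `COLOR_NAMES` set and the assignment regex are illustrative assumptions, not a real linter's API:

```python
import re

# Hypothetical hard-coded lint rule: flag variables whose names are colors.
# A real tool would use an AST rather than a regex, and a fuller color list.
COLOR_NAMES = {"red", "green", "blue", "yellow", "purple", "orange"}

ASSIGNMENT_RE = re.compile(r"^\s*([A-Za-z_]\w*)\s*=")

def flag_color_variables(source: str) -> list[str]:
    """Return variable names assigned in `source` that are color words."""
    flagged = []
    for line in source.splitlines():
        m = ASSIGNMENT_RE.match(line)
        if m and m.group(1).lower() in COLOR_NAMES:
            flagged.append(m.group(1))
    return flagged

print(flag_color_variables("red = 255\ncount = 3\nBlue = 0"))
# → ['red', 'Blue']
```

The deterministic check is exact but brittle (it only catches names in its list and only simple assignments), whereas the LLM reviewer generalizes to cases the rule's author never enumerated, at the cost of no hard guarantee.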
SR2Z 6 hours ago
I mean, having unit tests and not allowing PRs to merge unless they all pass is pretty easy (or requiring human review to remove a test!). A software engineer takes a spec, which likewise "shifts the distribution of acceptable responses" for their output, and even humans aren't 100% accurate (snort). So how good does an LLM have to be before you accept its review as reasonable?