mvanbaak 5 hours ago

I still don't get the idea behind AI code reviews. A code review (at least in my opinion) is for your peers to check whether the changes will have a positive or negative effect on the overall code + architecture. I have yet to see an LLM be good at this.

Sure, they will leave comments about commonly made errors (your editor should already warn about these before you even commit), etc. But flagging that weird thing that was done to make something a lot of customers wanted a reality? That I have yet to see.

Also, PRs are created to share knowledge. Questions and answers on them spread knowledge within the team. AI does not do that.

[edit] Added the part about knowledge sharing

simonw 5 hours ago | parent | next [-]

Sure, AI code reviews aren't a replacement for an architecture review on a larger team project.

But they're fantastic at spotting dumb mistakes or low-hanging fruit for improvements!

And having the AI spot those for you first means you don't waste your team's valuable reviewing time on the simple stuff that you could have caught early.

emeraldd 4 hours ago | parent | next [-]

My experience with AI code reviews has been very mixed and more on the negative side than the positive one. In particular, I've had to disable the AI reviewer on some projects my team manages because it was so chatty that it caused meaningful notifications from team members to be missed.

In most of the repos I work with, it tends to make a large number of false-positive or inappropriate suggestions that are just plain wrong for the code base in question. Sometimes these might be OK in other settings, but here they're generally just wrong. About 1 in every 10-20 comments is actually useful or catches something novel that hasn't been caught elsewhere. The net effect is that the AI reviewer we're effectively forced to use is just noise that gets ignored because it's so wrong so often.

fusslo 3 hours ago | parent | next [-]

One person proved the uselessness of AI reviews for our entire company.

He'd make giant PRs: 100+ files changed, 1,000+ word descriptions. Impossible to review. Eventually he just modified the permissions to require a single approval, approved his own changes, and merged. This is still going on, but he's isolated to repos he made himself.

He'd copy/paste AI output onto other people's reviews. Often they were false positives or open-ended questions. So he automated his side, but doubled or tripled the work of the person requesting the review. Not to mention the AI's comments were 100-300 words each, with formatting and emojis.

The contractors refused to address any comments made by him. Some felt it was massively disrespectful, as they put tons of time and effort into their changes and he couldn't even be bothered to read them himself.

It got to the CTO. And AI reviews have been banned.

But it HAS helped the one junior guy on the team prepare for reviews and understand review comments better. It's also helped us write better comments, since I and some others can be really bad at explaining things.

syntheticcdo 3 hours ago | parent | prev | next [-]

Sometimes the only review a PR needs is "LGTM" - something today's LLMs are structurally incapable of.

CuriouslyC 40 minutes ago | parent [-]

The solution to this is to get agents to output an importance score (0-1), then just filter out anything below whatever threshold you prefer before creating comments/change requests.
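
Roughly something like this, as a sketch (the output shape here is made up; adapt it to whatever your agent actually emits):

    # Sketch: keep only agent findings above an importance threshold
    # before turning them into PR comments. The JSON shape is hypothetical.
    import json

    THRESHOLD = 0.6

    def filter_findings(agent_output: str, threshold: float = THRESHOLD) -> list[dict]:
        # e.g. [{"file": "api.py", "line": 12, "comment": "...", "importance": 0.3}, ...]
        findings = json.loads(agent_output)
        return [f for f in findings if f.get("importance", 0.0) >= threshold]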

insin 2 hours ago | parent | prev [-]

I love having to hit Resolve Conversation umpteen times before I can merge because somebody added Copilot and it left that many dumb questions/suggestions.

mvanbaak 5 hours ago | parent | prev [-]

Those AI checks, if you insist on having them, should be part of your pre-commit, not part of your PR review flow. They are at best (if they even reach that level) as good as a local run of a linter or static type checker. If you run them as a PR check, the PR is already out there, so people will spend time on it whether you fix the AI comments or not. Best to fix those things BEFORE you present your code to the team.
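
As a rough sketch of what I mean (review_diff is a placeholder for whatever model or tool you actually use, not a real API), a local pre-commit hook could review the staged diff before anything reaches the team:

    #!/usr/bin/env python3
    # Sketch of a .git/hooks/pre-commit script: review the staged diff locally
    # before the code is ever shared. review_diff() is a stand-in, not a real API.
    import subprocess
    import sys

    def review_diff(diff: str) -> list[str]:
        # Plug in your model of choice here and return a list of findings.
        return []

    def main() -> int:
        diff = subprocess.run(
            ["git", "diff", "--cached"], capture_output=True, text=True, check=True
        ).stdout
        if not diff.strip():
            return 0  # nothing staged, nothing to review
        findings = review_diff(diff)
        for finding in findings:
            print(finding, file=sys.stderr)
        return 1 if findings else 0  # non-zero exit blocks the commit

    if __name__ == "__main__":
        sys.exit(main())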

[edit] Added the part about wasting your team's time

nnutter 4 hours ago | parent | next [-]

My team uses draft PRs and goes through a process, including AI review, before removing the draft status, thereby triggering any remaining human review.

A PR is also a decent UI for getting that feedback, and especially for documenting/discussing the AI review suggestions with the team, just like human review.

AI review is also not equivalent to linters and static checks. It can suggest practices appropriate to the language and to your code base. Like a lot of my AI experiences, it's pretty hit-or-miss and non-deterministic, but it costs little to disregard the misses and I appreciate the hits.

aidanlister 4 hours ago | parent | prev | next [-]

This just sounds like you haven’t worked in a team environment in the last 12 months.

The ergonomics of doing this in pre-commit make no sense.

Spin up a PR in GitHub and get Cursor and/or Claude to do a code review — it’s amazing.

It'll often spot bugs (not only obvious ones), it'll utilise your agent.md to spot mismatched coding style and missing documentation, it'll check Sentry to see if this part of the code touches a hotspot or a line that's been throwing errors … it's an amazing first pass.

Once all the issues are resolved you can mark the PR as ready for review and get a human to look big picture.

It’s unquestionably a huge time saver for reviewers.

And having the AI and human review take place with the same UX (comments attached to lines of code, being able to chat to the AI to explain decisions, having the AI resolve the comment when satisfied) just makes sense and is an obvious time saver for the submitter.

Sharlin 4 hours ago | parent | next [-]

Stuff like coding style and missing documentation is what your basic dumb formatter and linter are supposed to catch; using an LLM for such things is hilarious overkill and a waste of electricity.

gerad 2 hours ago | parent [-]

Your linter can tell if a comment exists. AI can tell if it’s up to date.

mvanbaak 4 hours ago | parent | prev | next [-]

Why not have AI review your code BEFORE you share it with the team? That shows so much more respect to the rest of the team than just throwing your code into the wild, only to change it because some robot tells you that X could be Y.

wakawaka28 4 hours ago | parent | prev [-]

It makes as much sense to use AI in pre-commit as it does to use a linter.

tokioyoyo 4 hours ago | parent | prev | next [-]

We have AI code reviews enabled on some PRs, and from time to time we discuss the comments on the PR to see whether they're worth it.

simonw 5 hours ago | parent | prev [-]

I completely agree.

CuriouslyC 42 minutes ago | parent | prev | next [-]

If you have an architecture document, READMEs for related services, relevant code from related services, and such assembled for an LLM, it can do a pretty solid review even on microservices. It can catch parameter mismatches/edge cases, instrument logging end to end, do some reasonable flow modeling, etc. It can also point out when uncovered code is a risk, and do a sanity check on tests.
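
A minimal sketch of what I mean by assembling that context (file paths and prompt wording are purely illustrative):

    # Sketch: gather architecture docs, related-service READMEs, and the diff
    # into one review prompt. Paths and wording are illustrative only.
    from pathlib import Path

    CONTEXT_FILES = [
        "docs/architecture.md",
        "services/orders/README.md",
        "services/billing/README.md",
    ]

    def build_review_prompt(diff: str, repo_root: str = ".") -> str:
        sections = []
        for rel in CONTEXT_FILES:
            path = Path(repo_root) / rel
            if path.exists():
                sections.append(f"## {rel}\n{path.read_text()}")
        sections.append(f"## Diff under review\n{diff}")
        sections.append(
            "Review the diff against the architecture and service contracts above. "
            "Flag parameter mismatches, unhandled edge cases, and missing logging."
        )
        return "\n\n".join(sections)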

In order to be time efficient, human review should focus on the 'what' rather than the 'how' in most cases.

bilalq 3 hours ago | parent | prev | next [-]

This question is surprising to me, because I consider AI code review the single most valuable aspect of AI-assisted software development today. It's ahead of line/next-edit tab completion, agentic task completion, etc.

AI code review does not replace human review. But AI reviewers will often notice little things that a human may miss. Sometimes the things they flag are false positives, but it's still worth checking in on them. If even one logical error or edge case gets caught by an AI reviewer that would've otherwise made it to production with just human review, it's a win.

Some AI reviewers will also factor in context of related files not visible in the diff. Humans can do this, but it's time consuming, and many don't.

AI reviews are also a great place to put "lint"-like rules that would be complicated to express in standard linting tools like ESLint.
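
For example (made-up rules, but the kind of thing I mean), you can hand the reviewer a list of plain-language rules that a linter can't easily encode:

    # Illustrative only: plain-language "lint" rules for an AI reviewer to
    # check on every PR, the kind that are awkward to express in ESLint.
    REVIEW_RULES = [
        "Every new API handler must check authorization before any data access.",
        "Database migrations must stay backwards compatible with the previous release.",
        "User-facing error messages must not leak internal identifiers or stack traces.",
    ]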

We currently run 3-4 AI reviewers on our PRs. The biggest problem I run into is outdated knowledge. We've had AI reviewers leave comments based on limitations of DynamoDB or whatever that haven't been true for the last year or two. And of course it feels tedious when 3 bots all leave similar comments on the same line, but even that is useful as reinforcement of a signal.
