cbg0 6 hours ago

It should have the same flow as reviewing PRs from humans.

t43562 6 hours ago | parent | next [-]

Who really truly enjoys that and doesn't see it as a chore?

I find the real way to review other people's code is to program with it; then I start seeing where the problems are much more clearly. I would do a review and spot nothing important, then start working on my own follow-on change and immediately run into issues.

sampullman 5 hours ago | parent | next [-]

I usually don't mind, but I tend to split reviews into two types. Either I understand the context and can quickly do an in-depth review, or I have to take some time to actually learn the code by reviewing the surrounding systems, experimenting with it, etc. But in either case I would at least run the code and verify correctness.

I think it becomes a chore when there are too many trivial mistakes, and you feel like your time would have been better spent writing it yourself. As models and agent frameworks improve I see this happening less and less.

cbg0 5 hours ago | parent | prev [-]

> Who really truly enjoys that and doesn't see it as a chore?

This is a whole different discussion, but I just see it as part of the job I'm getting paid for; I don't need to enjoy it to do it.

Functional testing is a must now that writing tests is also automated away by LLMs: it gives you a better sense of whether the code does what it says on the box. But there will still be a lot of hidden gotchas if you're not even looking at the code.

Plenty of LLM-written code runs fine until it doesn't, though we see this with human-written code too, so it's mostly about investing more time in the hope of spotting problems before they become problems.

t43562 5 hours ago | parent [-]

> Functional testing is a must now that writing tests is also automated away by LLMs: it gives you a better sense of whether the code does what it says on the box. But there will still be a lot of hidden gotchas if you're not even looking at the code.

Well, there you go. Letting AI write the tests is a mistake, IMO. When I'm working with other people I write tests too, and when I see their tests I know what they're missing, because I know the system and the existing tests. Sometimes I spot the problem in their tests while working on some of my own. If you absent yourself from that process, then ....

fg137 4 hours ago | parent | prev | next [-]

Which is a really, really bad idea.

Most people don't spend nearly enough time going through a code review. They certainly don't think as hard as needed to question the implementation or come up with all the edge cases. It's active vs passive thinking.

I, for one, have found numerous issues in other people's code that make me wonder, "would they have ever made such a mistake if they had hand-coded this?"

btw, a side effect is that nobody really understands the codebase. People just leave it to AI to explain what the code does. That's of course helpful for onboarding, but concerning for complex issues or long-term maintenance.

microtonal 6 hours ago | parent | prev [-]

The problem is that LLMs completely change the equation. Before LLMs, beyond the very junior (needs serious coaching) level, reviewing code was typically faster than writing it. With LLMs, writing code is orders of magnitude faster than reviewing it. We already see open source projects getting buried in LLM slop, and you have to find the real human, or at least carefully curated, contributions among it.

I would not be surprised if many open source projects stop taking PRs outright. I have had the same feeling several times: if I'm communicating with an LLM through the GitHub PR interface, I'd rather just talk to an LLM directly myself.

But ending PRs would make it painful to acquire new contributors and train more junior people. Hopefully the tooling will evolve. E.g. I'd love to have a system where someone has to open an issue with a plan first, and by approving it you could give them a 'ticket' to open a single PR for that issue. Though I would be surprised if GitHub and others created features that are essentially there to rein in Copilot etc.