ppeetteerr 3 days ago

This is not unique to the age of LLMs. PR reviews are often shallow because the reviewer is not giving the contribution the amount of attention and understanding it deserves.

With LLMs, the volume of code has only gotten larger, but those same LLMs can help review the code being written. Current code-review agents are surprisingly good at catching errors; better than most human reviewers.
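For a toy illustration of the kind of slip these agents tend to flag (a made-up example, not output from any particular tool):

    def moving_average(xs, window):
        # Off-by-one: the range should be len(xs) - window + 1,
        # so the final window is silently dropped.
        return [sum(xs[i:i + window]) / window
                for i in range(len(xs) - window)]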

We'll soon get to a point where it's no longer necessary to review code, either by the LLM prompter or by a second reviewer (the volume of generated code will be too great). Instead, we'll need to create new tools and guardrails to ensure that whatever is written is done in a sustainable way.

gyomu 3 days ago | parent | next [-]

> We'll soon get to a point where it's no longer necessary to review code, either by the LLM prompter or by a second reviewer (the volume of generated code will be too great). Instead, we'll need to create new tools and guardrails to ensure that whatever is written is done in a sustainable way.

The real breakthrough would be finding a way to not even do things that don’t need to be done in the first place.

90% of what management thinks it wants gets discarded/completely upended a few days/weeks/months later anyway, so we should have AI agents that just say “nah, actually you won’t need that” to 90% of our requests.

Bukhmanizer 3 days ago | parent | prev | next [-]

> We'll soon get to a point where it's no longer necessary to review code, either by the LLM prompter or by a second reviewer (the volume of generated code will be too great). Instead, we'll need to create new tools and guardrails to ensure that whatever is written is done in a sustainable way.

This seems silly to me. In most cases, the least amount of work you can possibly do is to logically describe the process you want, along with its boundaries, and run that logic over the input data. In other words, coding.

The idea that, to avoid coding or reading code, we should come up with a whole new process to keep generated code on track would almost certainly take more effort than just getting the logical incantations right the first time.
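To make that concrete, here's a made-up requirement where the code itself is already the shortest precise description of the process; any natural-language guardrail wrapped around it would have to carry at least this much information anyway:

    # Hypothetical requirement: "keep orders over 100, newest first".
    def top_orders(orders, threshold=100):
        return sorted(
            (o for o in orders if o["amount"] > threshold),
            key=lambda o: o["date"],
            reverse=True,
        )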

foxfired 3 days ago | parent | prev | next [-]

One thing to take into account is that PR reviews aren't there just for catching errors in the code. They also ensure that the business logic is correct. For example, you can have code that passes all tests and looks good, but doesn't align with the business logic.
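A minimal sketch of that failure mode (the rule and names are hypothetical): the tests encode the author's misreading of the requirement, so they pass while the business rule is still wrong.

    def shipping_fee(order_total):
        # Passes its tests and looks fine, but the (hypothetical)
        # business rule is free shipping at 50 *or more*; the
        # boundary is wrong and no test covers it.
        return 0 if order_total > 50 else 5

    assert shipping_fee(100) == 0
    assert shipping_fee(20) == 5
    # Missing: assert shipping_fee(50) == 0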

vjvjvjvjghv 3 days ago | parent | prev | next [-]

> With LLMs, the volume of code has only gotten larger

It’s even worse with offshore devs. They produce a ton of code you have to review every morning.

prybeng 3 days ago | parent | prev | next [-]

I wonder if the paradigm shift is the adoption of a higher-level language, akin to what Python did by black-boxing C libraries.
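Loosely, the way a NumPy caller never reads the compiled routines underneath (a rough illustration):

    import numpy as np

    # The caller writes high-level Python; the heavy lifting happens
    # in compiled C/BLAS code the caller never reads.
    a = np.random.rand(1000, 1000)
    b = np.random.rand(1000, 1000)
    c = a @ b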

Ianjit 2 days ago | parent [-]

I'm not a programmer, but I've always had the impression that different languages are appropriate for different tasks. My question is, "For what type of programming task is English the correct level of abstraction?"

fzeroracer 3 days ago | parent | prev [-]

Can you define what an "error" is?

ppeetteerr 3 days ago | parent [-]

A logic error, for instance.

fzeroracer 3 days ago | parent [-]

Well, it depends on the logic error, doesn't it? And it depends on how the system is intended to behave. A method that does 2+2=5 is a logic error, but it could be a load-bearing method in the system that blows up when changed to be correct.
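A contrived sketch of what I mean, where the bug is load-bearing:

    def add(a, b):
        return a + b + 1        # the "2 + 2 = 5" logic error

    def checksum(values):
        # Written and tuned against the buggy add(); fix add() in
        # isolation and every previously stored checksum stops
        # validating.
        total = 0
        for v in values:
            total = add(total, v)
        return total % 256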

Something like blowing up the stack or going out of bounds is more obviously a bug, but detecting those often requires inferring how the code behaves at runtime. LLMs might work for detecting the most basic cases because those appear most often in their data set, but whenever I see people suggest that they're good at reviewing, I think it's from people who don't deeply review code.
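For instance, in a made-up case like this, nothing looks wrong line by line, yet a sufficiently deep input blows the stack at runtime:

    class Node:
        def __init__(self, left=None, right=None):
            self.left, self.right = left, right

    def depth(node):
        # Looks fine in review and passes tests on small trees, but a
        # degenerate (list-shaped) tree deeper than Python's default
        # recursion limit (~1000) raises RecursionError at runtime.
        if node is None:
            return 0
        return 1 + max(depth(node.left), depth(node.right))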