| ▲ | its-kostya 5 days ago |
| Code review is part of the job, but one of the least enjoyable parts. Developers like _writing_ and that gives the most job satisfaction. AI tools are helpful, but they inherently increase the amount of code we have to review, and with more scrutiny than my colleagues' code gets, because of how unpredictable - yet convincing - they can be. Why did we create tools that do the fun part and increase the non-fun part? Where are the "code-review" agents at? |
|
| ▲ | jmcodes 5 days ago | parent | next [-] |
| Maybe I'm weird but I don't actually enjoy the act of _writing_ code. I enjoy problem solving and creating something. I enjoy decomposing systems and putting them back together in a better state, but actually manually typing out code isn't something I enjoy. When I use an LLM to code I feel like I can go from idea to something I can work with in much less time than I would have normally. Our codebase is more type-safe, better documented, and it's much easier to refactor messy code into the intended architecture. Maybe I just have lower expectations of what these things can do but I don't expect it to problem solve. I expect it to be decent at gathering relevant context for me, at taking existing patterns and re-applying them to a different situation, and at letting me talk shit to it while I figure out what actually needs to be done. I especially expect it to allow me to be lazy and not have to manually type out all of that code across different files when it can just generate it all in a few seconds and I can review each change as it happens. |
| |
| ▲ | legacynl 2 days ago | parent | next [-] | | If natural language were an efficient way to write software we would have done it already. The fact is that it's faster to write class X { etc }; than it is to write "create a class named X with behavior etc". If you want to think and solve problems yourself, it doesn't make sense to then increase your workload by putting your thoughts in natural language, which will be more verbose. I therefore think it makes the most sense to just feed it requirements and issues, and tell it to provide a solution. Also, unless you're starting a new project or a big feature with a lot of boilerplate, in my experience it's almost never necessary to create a lot of files with a lot of text in them at once. | |
| ▲ | kiitos 4 days ago | parent | prev | next [-] | | the time spent literally typing code into an editor is never the bottleneck in any competently-run project if the act of writing code is something you consider a burden rather than a joy then my friend you are in the wrong profession | | |
| ▲ | jmcodes 4 days ago | parent | next [-] | | Been doing it for ten years, still love the profession as much if not more than when I started, but the joy of software development for me was always in seeing my idea come to life, in exploring all the clever ways people had solved so many problems, in trying to become as good at the craft as they were, and in sharing those solutions and ideas with like-minded peers. I care deeply about the code quality that goes into the projects I work on because I end up having to maintain it, review it, or fix it when it goes south, and honestly it just feels wrong to me to see bad code. But literally typing out the characters that make up the code? I couldn't care less. I've done that already. I can do it in my sleep, there's no challenge. At this stage in my career I'm looking for ways to take the experience I have and upskill my teams using it. I'd be crazy not to try and leverage LLMs as much as possible. That includes spending the time to write good CLAUDE.md files, set up custom agents that work with our codebase and patterns; it also includes taking the time to explain the why behind those choices to the team so they understand them, calling out bad PRs that "work" but are AI slop, and teaching them how to get better results out of these things. Idk man, the profession is pretty big and creating software is still just as fun as when I was doing it character by character in notepad. I just don't care to type more than I need to when I can focus on problem solving and building. | | |
| ▲ | its-kostya 3 days ago | parent [-] | | While reading your comment it occurred to me that people code at different abstraction levels. I do systems programming in golang and rust and I - like you - enjoy seeing my ideas come to life, not so much the typing. The final result (how performant, how correct, how elegant and not complex) is in my control instead of an agent's; I enjoy having the creativity in the implementation. I can imagine other flavors of the profession working at higher abstraction layers and using more frameworks, where the result is dependent on how the framework executes. At that point, you might just want to connect all the frameworks/systems and get the feature out the door. And it is definitely a spectrum of languages, tools, and frameworks that are more or less involved. The creativity in implementing (e.g. an indexed array that, when it grows too large, gets reorganized into a hashmap - slower per access but better at scale; sketched below) is what I imagine being lost, and that brings people satisfaction. Pulling that off in a clean and not complex way... well, there is a certain reward in that. I don't have any long-term proof but I also hypothesize it helps with maintainability. But I also see your point: sometimes I need a tool that does a function and I don't care to write it, and giving the agent requirements and having it implemented is enough. But typically these tools are used and discarded. | | |
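A minimal Go sketch of that adaptive-container idea, for concreteness (the type, cutoff, and names are illustrative, not from anyone's real codebase): scan a small slice linearly while it's cheap, then promote to a map once it crosses a threshold.

    // Adaptive set: linear scan over a slice while small, promoted to a
    // hashmap once it grows past a cutoff. The cutoff is illustrative;
    // in practice you'd pick it with benchmarks.
    package adaptive

    const promoteAt = 64

    type Set struct {
        small []string
        big   map[string]struct{}
    }

    func (s *Set) Contains(key string) bool {
        if s.big != nil {
            _, ok := s.big[key]
            return ok
        }
        for _, k := range s.small {
            if k == key {
                return true
            }
        }
        return false
    }

    func (s *Set) Add(key string) {
        if s.big != nil {
            s.big[key] = struct{}{}
            return
        }
        if s.Contains(key) {
            return
        }
        s.small = append(s.small, key)
        if len(s.small) > promoteAt {
            // Reorganize: O(1) lookups at scale, at the cost of hashing
            // overhead and worse cache locality at small sizes.
            s.big = make(map[string]struct{}, len(s.small))
            for _, k := range s.small {
                s.big[k] = struct{}{}
            }
            s.small = nil
        }
    }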
| ▲ | jmcodes 3 days ago | parent [-] | | Agreed 100%, and I enjoy that part too, I just don't really see how that is being taken away. The way I see it these tools allow me to use my actual brainpower mostly on those problems. Because all the rote work can now be workably augmented away, I can choose which problems to actually focus on "by hand" as it were. I'd never give those problems to an LLM to solve. I might however ask it to search the web for papers or articles or what have you that have solved similar problems and go from there. If someone is giving that up then I'd question why they're doing that... No one is forcing them to. It's the problem solving itself that is fun, the "layer" that it's in doesn't really make a difference to me. |
|
| |
| ▲ | theshrike79 3 days ago | parent | prev [-] | | But it's not exactly rewarding to add one more CRUD endpoint. It's a shit-ton of typing in multiple layers. An LLM can do it in two minutes while I fetch coffee, then I can proceed to add the complex bits (if there are any). | | |
| ▲ | kiitos a day ago | parent [-] | | > But it's not exactly rewarding to add one more CRUD endpoint. It's a shit-ton of typing in multiple layers. i don't disagree with you but if "adding one more CRUD endpoint" and similar rote tasks represent any significant amount of your engineering hours, especially in the context of business impact, then something is fundamentally broken in your team, engineering org, or company overall time spent typing code into an editor is usually, hopefully!, approximately statistically 0% of overall engineering time |
|
| |
| ▲ | skydhash 5 days ago | parent | prev [-] | | Code is the ultimate fact checker, where what you write is what gets done. Specs are well written wishes. | | |
| ▲ | jmcodes 5 days ago | parent [-] | | Yes, hence tests, linters, and actually verifying the changes it is making. You can't trust anything the LLM writes. It will hallucinate or misunderstand something at some point if your task gets long. But that's not the point; I'm not asking it to solve things for me. I'm using it to get faster at building my own understanding of the problem, what needs to get done, and then just executing the rote steps I've already figured out. Sometimes I get lucky and the feature is well defined enough just from the context gathering step that the implementation is literally just me hitting the enter key as I read the edits it wants to make. Sometimes I have to interrupt it and guide it a bit more as it works. Sometimes I realize I misunderstood something as it's thinking about what it needs to do. One-shotting or asking the LLM to think for you is the worst way to use them. |
|
|
|
| ▲ | simonw 5 days ago | parent | prev | next [-] |
| > Where are the "code-review" agents at? OpenAI's Codex Cloud just added a new feature for code review, and their new GPT-5-Codex model has been specifically trained for code review: https://openai.com/index/introducing-upgrades-to-codex/ Gemini and Claude both have code review features that work via GitHub Actions: https://developers.google.com/gemini-code-assist/docs/review... and https://docs.claude.com/en/docs/claude-code/github-actions GitHub have their own version of this pattern too: https://github.blog/changelog/2025-04-04-copilot-code-review... There are also a whole lot of dedicated code review startups like https://coderabbit.ai/ and https://www.greptile.com/ and https://www.qodo.ai/products/qodo-merge/ |
| |
| ▲ | vrighter 5 days ago | parent [-] | | you can't use a system with the exact same hallucination problem to check the work of another one just like it. Snake oil | | |
| ▲ | bcrosby95 5 days ago | parent | next [-] | | I don't think it's that simple. Fundamentally, unit tests are using the same system to write your invariants twice; it just so happens that the two statements are different enough that a failure in one tends to reveal a bug in the other (tiny illustration below). You can't reasonably state this won't be the case with tools built for code review until the failure cases are examined. Furthermore, a simple way to help get around this is by writing code with one product while reviewing the code with another. | |
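A tiny Go illustration of the "invariants twice" point, with made-up names: the implementation and the test each restate what Abs must do, and a mistake in either one surfaces as a failure.

    // In a file named abs_test.go; Abs and its test each encode the same
    // invariant independently, so a bug in one is caught by the other.
    package mathx

    import "testing"

    func Abs(x int) int {
        if x < 0 {
            return -x
        }
        return x
    }

    func TestAbs(t *testing.T) {
        for _, x := range []int{-3, 0, 7} {
            got := Abs(x)
            // The invariant, restated: non-negative, magnitude preserved.
            if got < 0 || (got != x && got != -x) {
                t.Errorf("Abs(%d) = %d", x, got)
            }
        }
    }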
| ▲ | jmull 5 days ago | parent [-] | | > unit tests are using the same system to write your invariants twice For unit tests, the parts of the system that are the same are not under test, while the parts that are different are under test. The problem with using AI to review AI is that what you're checking is the same as what you're checking it with. Checking the output of one LLM with another brand probably helps, but they may also have a lot of similarities, so it's not clear how much. | | |
| ▲ | adastra22 3 days ago | parent | next [-] | | > The problem with using AI to review AI is that what you're checking is the same as what you're checking it with. This isn't true. Every instantiation of the LLM is different. Oversimplifying a little, but hallucination emerges when low-probability next words are selected. True explanations, on the other hand, act as attractors in state-space. Once stumbled upon, they are consistently preserved. So run a bunch of LLM instances in parallel with the same prompt. The built-in randomness & temperature settings will ensure you get many different answers, some quite crazy. Evaluate them in new LLM instances with fresh context. In just 1-2 iterations you will home in on state-space attractors, which are chains of reasoning well supported by the training set (compact sketch below). | |
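A compact sketch of that fan-out-and-evaluate shape, assuming a hypothetical ask(prompt) helper standing in for "send this prompt to a fresh model instance" (no real API implied):

    package main

    import "strings"

    // ask is a stand-in for one call to a fresh LLM instance with an
    // empty context; it is not a real client library.
    func ask(prompt string) string { return "" }

    // sampleAndJudge draws n independent answers at nonzero temperature,
    // then has a fresh instance with a clean context pick the one best
    // supported - the "attractor" that survives independent evaluation.
    func sampleAndJudge(prompt string, n int) string {
        answers := make([]string, n)
        for i := range answers {
            answers[i] = ask(prompt) // same prompt, different random draws
        }
        return ask("Which of these answers is best supported by its own reasoning? Reply with it verbatim:\n" +
            strings.Join(answers, "\n---\n"))
    }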
| ▲ | Demiurge 5 days ago | parent | prev | next [-] | | What if you use a different AI model? Sometimes just a different seed generates a different result. I notice there is a benefit to seeing and contrasting the different answers. The improvement is gradual, it’s not a binary. | | |
| ▲ | adastra22 4 days ago | parent [-] | | You don't need to use a different model, generally. In my experience a fresh context window is all you need, the vast majority of the time. |
| |
| ▲ | bcrosby95 4 days ago | parent | prev [-] | | The system is the human writing the code. |
|
| |
| ▲ | adastra22 4 days ago | parent | prev | next [-] | | Yes you can, and this shouldn't be surprising. You can take the output of an LLM and feed it into another LLM and ask it to fact-check. Not surprisingly, these LLMs have a high false negative rate, meaning that they won't always catch the error. (I think you agree with me so far.) However, these failure probabilities are independent of each other, so long as you don't share context. The converse is that the LLM has a less-than-we-would-like probability of detecting a hallucination, but if it does then verification of that fact is reliable in future invocations. Combine this together: you can ask an LLM to do X, for any X, then take the output and feed it into some number of validation instances to look for hallucinations, bad logic, poor understanding, whatever. What you get back on the first pass will look like a flip of the coin -- one agent claims it is hallucination, the other agent says it is correct; both give reasons. But feed those reasons into follow-up verifier prompts, and repeat (a rough sketch of the loop follows below). You will find that non-hallucination responses tend to persist, while hallucinations are weeded out. The stable point is the truth. This works. I have workflows that make use of this, so I can attest to its effectiveness. The new-ish Claude Code sub-agent capabilities and slash commands are excellent for doing this, btw. |
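A rough sketch of the loop being described, again with a hypothetical ask(prompt) standing in for "fresh LLM instance, empty context" (nothing here is a real API):

    package main

    import (
        "fmt"
        "strings"
    )

    // ask is a stand-in for one call to a fresh model instance.
    func ask(prompt string) string { return "" }

    // verified runs n independent validators over an answer, feeds their
    // objections back in a fresh context, and repeats. Claims that survive
    // every round are treated as stable; hallucinations tend to get weeded
    // out because the validators' failures are independent.
    func verified(answer string, n, rounds int) (string, bool) {
        claim := answer
        for r := 0; r < rounds; r++ {
            var objections []string
            for i := 0; i < n; i++ {
                v := ask("Fact-check the following and reply OK if sound, otherwise list errors:\n" + claim)
                if !strings.HasPrefix(v, "OK") {
                    objections = append(objections, v)
                }
            }
            if len(objections) == 0 {
                return claim, true // stable across independent validators
            }
            claim = ask(fmt.Sprintf(
                "Answer:\n%s\n\nObjections:\n%s\n\nRewrite the answer, keeping only well-supported claims.",
                claim, strings.Join(objections, "\n")))
        }
        return claim, false
    }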
| ▲ | simonw 5 days ago | parent | prev | next [-] | | It's snake oil that works surprisingly well. | |
| ▲ | ben_w 5 days ago | parent | prev [-] | | Weirdly, you can not only do this, it somehow does actually catch some of its own mistakes. Not all of the mistakes - they generally still have a performance ceiling below human experts (though even this disclaimer is still simplifying) - but this kind of self-critique is basically what makes the early "reasoning" models one up over simple chat models: for the first n :END: tokens, replace with "wait" and see it attempt other solutions and pick something usually better. | |
| ▲ | vrighter 4 days ago | parent [-] | | the "pick something usually better" sounds a lot like "and then draw the rest of the f*** owl" | | |
| ▲ | ben_w 4 days ago | parent [-] | | It turned out that for a lot of things (not all things, Transformers have a lot of weaknesses), using a neural network to score an output is, if not "fine", then at least "ok". Generating 10 options with a mediocre mean and some standard deviation, and then evaluating which is best, is much easier than deliberative reasoning to just get one thing right in the first place more often (compact sketch below). |
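The same idea as a compact sketch, with hypothetical sample() and score() helpers (neither is a real API): scoring candidates is the easy direction, so draw several and keep the winner.

    package main

    // sample and score are hypothetical stand-ins for "generate one
    // candidate at temperature > 0" and "have a scoring model rate it".
    func sample(prompt string) string            { return "" }
    func score(prompt, candidate string) float64 { return 0 }

    // bestOfN draws n candidates and returns the highest-scoring one.
    func bestOfN(prompt string, n int) string {
        var best string
        bestScore := -1.0
        for i := 0; i < n; i++ {
            c := sample(prompt)
            if s := score(prompt, c); s > bestScore {
                best, bestScore = c, s
            }
        }
        return best
    }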
|
|
|
|
|
| ▲ | aleph_minus_one 5 days ago | parent | prev | next [-] |
| > Code review is part of the job, but one of the least enjoyable parts. Developers like _writing_ and that gives the most job satisfaction. At least for me, what gives the most satisfaction (even though this kind of satisfaction happens very rarely) is when I discover some very elegant structure behind whatever has to be implemented that changes the whole way you think about programming (or often even about life) for decades. |
| |
| ▲ | marklubi 4 days ago | parent [-] | | > what gives the most satisfaction (even though this kind of satisfaction happens very rarely) is when I discover some very elegant structure behind whatever has to be implemented that changes the whole way you think about programming A number of years ago, I wrote a caching/lookup library that is probably some of my favorite code I've ever created. After the initial configuration, the use was elegant and there was really no reason not to use it if you needed to query anything that could be cached on the server side. Super easy to wrap just about any code with it as long as the response is serializable. CachingCore.Instance.Get(key, cacheDuration, () => { /* expensive lookup code here */ }); Under the hood, it would check the preferred caching solution (e.g., Redis/Memcache/etc), followed by less preferred options if the preferred wasn't available, followed by the expensive lookup if it wasn't found anywhere. Defaulted to in-memory if nothing else was available. If the data was returned from cache, it would then compare the expiration to the specified duration... If it was getting close to various configurable tolerances, it would start a new lookup in the background and update the cache (some of our lookups could take several minutes*, others just a handful of seconds). The hardest part was making sure that we didn't cause a thundering-herd type problem by looking up stuff multiple times... in-memory cache flags indicating lookups in progress, so we could hold up other requests while the cache fell through to the expensive lookup and then let them know once the data was available (a rough sketch of that single-flight core is below). While not the absolute worst case scenario, you might end up making the expensive lookups once from each of the servers that use it if the shared cache isn't available. * most of these have a separate service running on a schedule to pre-cache the data, but things have a backup with this method. |
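The original was C#-flavored; here is a rough Go sketch of just the get-or-compute core, using x/sync/singleflight for the thundering-herd guard (the tiered Redis/Memcache fallback and the background refresh near expiry are omitted, and all names are illustrative):

    package cache

    import (
        "sync"
        "time"

        "golang.org/x/sync/singleflight"
    )

    type entry struct {
        value     any
        expiresAt time.Time
    }

    type Cache struct {
        mu   sync.RWMutex
        data map[string]entry
        sf   singleflight.Group
    }

    func New() *Cache { return &Cache{data: map[string]entry{}} }

    // Get returns the cached value for key, or computes it with lookup on
    // a miss. singleflight collapses concurrent misses for the same key
    // into a single expensive lookup - the thundering-herd protection.
    func (c *Cache) Get(key string, ttl time.Duration, lookup func() (any, error)) (any, error) {
        c.mu.RLock()
        e, ok := c.data[key]
        c.mu.RUnlock()
        if ok && time.Now().Before(e.expiresAt) {
            return e.value, nil
        }
        v, err, _ := c.sf.Do(key, func() (any, error) {
            v, err := lookup() // the expensive part, done once per key
            if err != nil {
                return nil, err
            }
            c.mu.Lock()
            c.data[key] = entry{value: v, expiresAt: time.Now().Add(ttl)}
            c.mu.Unlock()
            return v, nil
        })
        return v, err
    }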
|
|
| ▲ | mercutio2 5 days ago | parent | prev | next [-] |
| Junior developers love writing code. Senior developers love removing code. Code review is probably my favorite part of the job, when there isn’t a deadline bearing down on me for my own tasks. So I don’t really agree with your framing. Code reviews are very fun. |
|
| ▲ | dearilos 12 hours ago | parent | prev | next [-] |
| I'm building something to solve exactly that - automating all the boring and repetitive parts of code review. |
|
| ▲ | KronisLV 5 days ago | parent | prev | next [-] |
| > Developers like _writing_ and that gives the most job satisfaction. Is it possible that this is just the majority and there are plenty of folks who dislike actually starting from nothing and the endless iteration to make something that works, as opposed to having some sort of a good/bad baseline to just improve upon? I’ve seen plenty of people that are okay with picking up a codebase someone else wrote and working with the patterns and architecture in there BUT when it comes to them either needing to create new mechanisms in it or create an entirely new project/repo it’s like they hit a wall - part of it probably being friction, part not being familiar with it, as well as other reasons. > Why did we create tools that do the fun part and increase the non-fun part? Where are the "code-review" agents at? Presumably because that’s where the most perceived productivity gain is. As for code review, there’s CodeRabbit, I think GitLab has their thing (Duo) and more options are popping up. Conceptually, there’s nothing preventing you from feeding a Git diff into RooCode and letting it review stuff, alongside reading whatever surrounding files it needs. |
| |
| ▲ | aleph_minus_one 3 days ago | parent [-] | | > I’ve seen plenty of people that are okay with picking up a codebase someone else wrote and working with the patterns and architecture in there BUT when it comes to them either needing to create new mechanisms in it or create an entirely new project/repo it’s like they hit a wall - part of it probably being friction, part not being familiar with it, as well as other reasons. For me, it's exactly the opposite: I love to build things from "nothing" (if I had the possibility, I would even like to write my own kernel in a novel programming language developed by me :-) ). On the other hand, when I pick up someone else's codebase, I nearly always (if it was not written by some insanely smart programmer) immediately find it badly written. In nearly all cases I tend to be right in my judgements (my boss agrees), but I am very sensitive to bad code, and often ask myself how the programmer who wrote the original code has not yet committed seppuku, considering how shameful the code is. Thus: you can, in my opinion, only enjoy picking up a codebase someone else wrote if you are incredibly tolerant of bad code. |
|
|
| ▲ | crazygringo 5 days ago | parent | prev | next [-] |
| > Developers like _writing_ and that gives the most job satisfaction. Not me. I enjoy figuring out the requirements, the high-level design, and the clever approach that will yield high performance, or reuse of existing libraries, or whatever it is that will make it an elegant solution. Once I've figured all that out, the actual process of writing code is a total slog. Tracking variables, remembering syntax, trying to think through every edge case, avoiding off-by-one errors. I've gone from being an architect (fun) to slapping bricks together with mortar (boring). I'm infinitely happier if all that can be done for me, everything is broken out into testable units, the code looks plausibly correct, and the unit tests for each function cover all cases and are demonstrably correct. |
| |
| ▲ | pmg101 5 days ago | parent | next [-] | | You don't really know if the system design you've architected in your mind is any good though, do you, until you've actually tried coding it. Discovering all the little edge cases at that point is hard work ("a total slog") because it's where you find out where the flaws in your thinking were, and how your beautifully imagined abstractions fall down. Then after going back and forth between thinking about it and trying to build it a few times, after a while you discover the real solution. Or at least that's how it's worked for me for a few decades, everyone might be different. | | |
| ▲ | esafak 4 days ago | parent [-] | | He did not say he does not iterate! And it is much easier and faster to do when an LLM is involved. |
| |
| ▲ | skydhash 5 days ago | parent | prev [-] | | > Tracking variables, remembering syntax, That's why you have short functions, so you don't have to track that many variables. And use symbol completion (a standard in many editors). > trying to think through every edge case, avoiding off-by-one errors. That is designing, not coding. Sometimes I think of an edge case, but I'm already on a task that I'd like to finish, so I just add a TODO comment. Then at least before I submit the PR, I ripgrep the project for this keyword and others. Sometimes the best design is done by doing. The tradeoffs become clearer when you have to actually code the solution (too much abstraction, too verbose, unwieldy,...) instead of relying on your mind (everything seems simpler). | |
| ▲ | crazygringo 5 days ago | parent [-] | | You always have variables. Not just at the function level, but at the class level, object level, etc. And it's not about symbol completion, it's about remembering all the obscure differences in built-in function names and which does what. And no, off-by-one errors and edge cases are firmly part of coding, once you're writing code inside of a function. Edge cases are not "todos", they're correctly handling all possible states. > Sometimes the best design is done by doing. I mean, sure go ahead and prototype, rewrite, etc. That doesn't change anything. You can have the AI do that for you too, and then you can re-evaluate and re-design. The point is, I want to be doing that evaluation and re-designing. Not typing all the code and keeping track of loop states and variable conditions and index variables and exit conditions. That stuff is boring as hell, and I've written more than enough to last a lifetime already. | | |
| ▲ | skydhash 4 days ago | parent [-] | | > You always have variables. Not just at the function level, but at the class level, object level, etc. Aka the scope. And the namespace of whatever you want to access. Which is a design problem. > And it's not about symbol completion, it's about remembering all the obscure differences in built-in function names and which does what That's what references are for. And some IDEs bring it right alongside the editor. If not, you have online and offline references. You remember them through usage and semantics. > And no, off-by-one errors and edge cases are firmly part of coding, once you're writing code inside of a function. It's not. You define the happy path and error cases as part of the specs. But they're generally lacking in precision (full of ambiguities) and only care about the essential complexity. The accidental complexity comes as part of the platform and is also part of the design. Pushing those kinds of errors into coding is shortsightedness. > Not typing all the code and keeping track of loop states and variable conditions and index variables and exit conditions. That stuff is boring as hell and I've written more than enough to last a lifetime already That is like saying "Not typing all the text and keeping track of words and punctuation and paragraphs and signatures. English is boring as hell and I've written more than enough..." If you don't like formality, say so. I've never had anyone describe coding as you did. No one thinks about that stuff that closely. It's like a guitar player complaining about which strings to strike with a finger. Or a race driver complaining about the angle of the steering wheel and having to press the brake. | |
| ▲ | crazygringo 4 days ago | parent [-] | | I don't know what to tell you. Sure there are tools like IDEs to help, but they don't help with everything. The simple fact is that I find there's very little creative satisfaction to be found in writing most functions. Once you've done it 10,000 times, it's not exactly fun anymore, I mean unless you're working on some cutting-edge algorithm, which is not what we're doing 99.9% of the time. The creative part lives at the higher level of design, where it's no longer rote. This is the whole reason why people move up into architecture roles, designing systems and libraries and APIs instead of writing lines of code. The analogies with guitar players or race car drivers or writers are flawed, because nothing they do is rote. Every note matters, every turn, every phrase. They're about creativity and/or split-second decision making. But when you're writing code, that's just not the case. For anything that's a 10- or 20-line function, there isn't usually much creativity there, 99.99% of the time. You're just translating an idea into code in a straightforward way. So when you say "Developers like _writing_ and that gives the most job satisfaction", that's just not true. Especially not for many experienced devs. Developers like thinking, in my experience. They like designing, the creative part. Not the writing part. The writing is just the means to the end. |
|
|
|
|
|
| ▲ | phito 5 days ago | parent | prev | next [-] |
| Because the goal of "AI" is not to have fun, it's to solve problems and increase productivity. I have fun programming too, but you have to realize the world isn't optimizing to make things more fun. |
| |
| ▲ | fhd2 5 days ago | parent | next [-] | | I hear you, but without any enjoyment in the process, quality and productivity go down the drain real fast. The Ironies of Automation paper is something I mention a lot; its core thesis is that making humans review / rubber-stamp automation reduces the quality of their work. People just aren't wired to do boring stuff well. | |
| ▲ | skydhash 5 days ago | parent [-] | | Enjoyment and rewards are the drivers for motivation. | | |
| ▲ | fhd2 5 days ago | parent [-] | | Yeah, though in my experience, reward alone is not enough. |
|
| |
| ▲ | lapcat 5 days ago | parent | prev [-] | | > you have to realize the world isn't optimizing make things more fun. Serious question: why not? IMO it should be. If "progress" is making us all more miserable, then what's the point? Shouldn't progress make us happier? It feels like the endgame of AI is that the masses slave away for the profit of a few tech overlords. | | |
| ▲ | phito 5 days ago | parent | next [-] | | As a human, I do agree that it would be better and we should strive for that. However, I don't think humans are really driving all this progress/innovation. It is just evolution continuing to do what it's always done; it is ruthless and does not care at all whether we are having fun or not. | |
| ▲ | cindyllm 5 days ago | parent | prev [-] | | [dead] |
|
|
|
| ▲ | cmrdporcupine 5 days ago | parent | prev [-] |
| If you have a paid Copilot membership and a GitHub project you can request a code review from Copilot. And it doesn't do a terrible job, actually. |
| |
| ▲ | sublinear 5 days ago | parent [-] | | I will second this. I believe code review agents and search summaries are the way forward for coding with LLMs. The ability to ignore AI and focus on solving the problems has little to do with "fun". If anything it leaves a human-auditable trail to review later and hold accountable devs who have gone off the rails and routinely ignored the sometimes genuinely good advice that comes out of AI. If humans don't have to helicopter over developers, that's a much bigger productivity boost than letting AI take the wheel. This is a nuance missed by almost everyone who doesn't write code or care about its quality. |
|