| ▲ | Every layer of review makes you 10x slower (apenwarr.ca) |
| 207 points by greyface- 6 hours ago | 99 comments |
| |
|
| ▲ | alkonaut 2 minutes ago | parent | next [-] |
| I think this makes an assumption early on which is that things are serialized, when usually they are not. If I complete a bugfix every 30 minutes, and submit them all for review, then I really don't care whether the review completes 5 hours later. By that time I have fixed 10 more bugs! Sure, getting review feedback 5 hours later will force me to context switch back to 10 bugs ago and try to remember what that was about, and that might mean spending a few more minutes than necessary. But that time was going to be spent _anyway_ on that bug, even if the review had happened instantly. The key to keeping speed up in slow async communication is just working on N things at the same time. |
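The pipelining argument above can be made concrete with a back-of-envelope calculation (a sketch using the comment's own hypothetical numbers: 30-minute fixes, 5-hour review latency):

```python
# Back-of-envelope sketch of the pipelining argument above.
# Numbers are the hypothetical ones from the comment: a fix takes
# 30 minutes, review feedback arrives 5 hours (300 minutes) later.
FIX_MIN = 30
REVIEW_LATENCY_MIN = 300
DAY_MIN = 8 * 60  # one 8-hour working day

# Serialized: wait for each review to finish before starting the next fix.
serialized_fixes = DAY_MIN // (FIX_MIN + REVIEW_LATENCY_MIN)

# Pipelined: keep starting new fixes while reviews are pending; the
# latency delays when each fix lands, not how many you can produce.
pipelined_fixes = DAY_MIN // FIX_MIN

print(serialized_fixes, pipelined_fixes)  # 1 vs 16 fixes started per day
```

The cost the comment itself acknowledges — context switching back to a fix from hours ago — is exactly what this simple model leaves out.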
|
| ▲ | onion2k 4 hours ago | parent | prev | next [-] |
| But you can’t just not review things! Actually you can. If you shift the reviews far to the left, and call them code design sessions instead, and you raise problems in dailies, and you pair program through the gnarly bits, then 90% of what people think a review should find goes away. The expectation that you'll discover bugs and architecture and design problems doesn't exist if you've already agreed with the team what you're going to build. The remaining 10% of things like var naming, whitespace, and patterns can be checked with a linter instead of a person. If you can get the team to that level you can stop doing code reviews. You also need to build a team that you can trust to write the code you agreed you'd write, but if your reviews are there to check someone has done their job well enough then you have bigger problems. |
| |
| ▲ | loire280 3 hours ago | parent | next [-] | | I've seen engineers I respect abandon this way of working as a team for the productivity promise of conjuring PRs with a coding agent. It blows away years of trust so quickly when you realize they stopped reviewing their own output. | | |
| ▲ | overfeed 2 hours ago | parent | next [-] | | Perhaps due to a FOMO outbreak[1], upper management everywhere has demanded AI-powered productivity gains, and based on LoC/PR metrics, it looks like they are getting them. 1. The longer I work in this industry, the more it becomes clear that CxOs aren't great at projecting/planning, and default to copy-cat, herd behaviors when uncertain. | | |
| ▲ | tripledry an hour ago | parent [-] | | Would love to be a fly on the wall for a couple of months to see what corporate CxO's actually do. Surely I could do a mediocre job as a CxO by parroting whatever is hot on Linkedin. Probably wouldn't be a massively successful one, but good enough to survive 2 years and have millions in the bank for that, or get fired and get a golden parachute. (half) joking - most likely I'm massively trivializing the role. | | |
| ▲ | eptcyka an hour ago | parent [-] | | A charitable explanation for what CxOs do is that they figure out their strategic goals and then focus really hard on ways to herd cats en masse to achieve the goals in an efficient manner. Some people end up doing a great job, some do so accidentally, others just end up doing a job. Sometimes parroting some linkadink drivel is enough to keep the ship on course - usually because the winds are blowing in the right direction or the people at the oars are working well enough on their own. |
|
| |
| ▲ | onion2k 3 hours ago | parent | prev [-] | | Putting too much trust in an agent is definitely a problem, but I have to admit I've written about a dozen little apps in the past year without bothering to look at the code and they've all worked really well. They're all just toys and utilities I've needed and I've not put them into a production system, but I would if I had to. Agents are getting really good, and if you're used to planning and designing up front you can get a ton of value from them. The main problem with them that I see today is people having that level of trust without giving the agent the context necessary to do a good job. Accepting a zero-shotted service to do something important into your production codebase is still a step too far, but it's an increasingly small step. | | |
| ▲ | camillomiller 2 hours ago | parent [-] | | >> Putting too much trust in an agent is definitely a problem, but I have to admit I've written about a dozen little apps in the past year without bothering to look at the code and they've all worked really well. They're all just toys and utilities I've needed and I've not put them into a production system, but I would if I had to. I have been doing this too, and I've forgotten half of them. For me the point is that this usage scenario is really good, but it also has no added value to it, really. The moment Claude Code raises its prices 2x this won't be viable anymore, and at the same time to scale this to enterprise software production levels you need to spend on an agent probably as much as hiring two SWEs, given that you need at least one to coordinate the agents. | |
| ▲ | jeremyjh an hour ago | parent | next [-] | | Deepseek v3.2 tokens are $0.26/0.38 on OpenRouter. That model - released 4 months ago - isn't really good enough by today's standards, but it's significantly stronger than Opus 4.1, which was only released last August! In 12 months I think it's reasonable to expect there will be a model cheaper than that and significantly stronger than anything available now. And no, it isn't ONLY because VC capital is being burned to subsidize cost. That is impossible for the dozen smaller providers offering service at that cost on OpenRouter, who have to compete with each other for every request and also have to pay compute bills. Qwen3.5-9B is stronger than GPT-4o and it runs on my laptop. That isn't just benchmarks either. Models are getting smaller, cheaper and better at the same time, and this is going to continue. | |
| ▲ | onion2k 2 hours ago | parent | prev [-] | | I think Claude could raise its prices 100x and people would still use it. It'd just shift to being an enterprise-only option and companies would actually start to measure the value instead of being "Whee, AI is awesome! We're definitely going really fast now!" |
|
|
| |
| ▲ | riffraff 3 hours ago | parent | prev | next [-] | | This is also the premise of pair programming/extreme programming: if code review is useful, we should do it all the time. | |
| ▲ | Swizec 3 hours ago | parent | prev | next [-] | | > You also need to build a team that you can trust to write the code you agreed you'd write I tell every hire new and old “Hey do your thing, we trust you. Btw we have your phone number. Thanks” Works like a charm. People even go out of their way to write tests for things that are hard to verify manually. And they verify manually what’s hard to write tests for. The other side of this is building safety nets. Takes ~10min to revert a bad deploy. | | |
| ▲ | pdhborges 17 minutes ago | parent | next [-] | | > The other side of this is building safety nets. Takes ~10min to revert a bad deploy. Does it? Reverting a bad deploy is not only about running the previous version. Did you mess up data? Did you take actions on third party services that need to be reverted? Did it have legal repercussions? | |
| ▲ | namanyayg 14 minutes ago | parent | prev [-] | | How does the phone number help? |
| |
| ▲ | froh 2 hours ago | parent | prev | next [-] | | yes! and it also works for me when working with ai. that produces much better results, too, when I first do a design session really discussing what to build. then a planning session laying out the steps to build it ("reviewability" world wonder). and then the instruction to stop when things get gnarly and work with the hooman. does anyone here have a good system prompt for that self-observation "I might be stuck, I'm kinda sorta looping. let's talk with hooman!"? | |
| ▲ | totetsu 3 hours ago | parent | prev | next [-] | | This seems to be a core of the problem with trying to leave things to autonomous agents .. The response to Amazon's agents deleting prod was to implement review stages https://blog.barrack.ai/amazon-ai-agents-deleting-production... | |
| ▲ | ozim an hour ago | parent | prev | next [-] | | Then you spend all your budget on code design sessions and have nothing to show to the customer. | |
| ▲ | ramon156 2 hours ago | parent | prev | next [-] | | I'm in a company that does no reviews and I'm medior. The tools we make are not interesting at all, so it's probably the best position I could ask for. I occasionally have time to explore some improvements, tools and side projects (don't tell my boss about that last one) | |
| ▲ | thrwaway55 25 minutes ago | parent | prev | next [-] | | Okay but Claude is a fucking moron. | |
| ▲ | DeathArrow 20 minutes ago | parent | prev | next [-] | | The issue is that every review adds a lot of delay. Won't a lot of alignment and pair programming be expensive in time too? | |
| ▲ | hinkley 2 hours ago | parent | prev | next [-] | | These systems make it more efficient to remove the actively toxic members from your team. Belligerence can be passive-aggressively “handled” by additional layers, but at considerable time and emotional labor cost to people who could be getting more work done without having to coddle untalented assholes. | |
| ▲ | anal_reactor 3 hours ago | parent | prev | next [-] | | I never review PRs, I always rubber-stamp them, unless they come from a certified idiot: 1. I don't care because the company at large fails to value quality engineering. 2. 90% of PR comments are arguments about variable names. 3. The other 10% are mistakes that have very limited blast radius. It's just that, unless my coworker is a complete moron, then most likely whatever they came up with is at least in an acceptable state, in which case there's no point delaying the project. Regarding knowledge share, it's complete fiction. Unless you actually make changes to some code, there's zero chance you'll understand how it works. | | |
| ▲ | recursivecaveat 3 hours ago | parent | next [-] | | Do people really argue about variable names? Most review comments I see are fairly trivial, but almost always not very subjective. (Leftover debug log, please add comment here, etc) Maybe it helps that many of our seniors are from a team where we had no auto-formatter or style guide at all for quite a while. I think everyone should experience that a random mix of `){` and `) {` does not really impact you in any way beyond the mild irking of a crooked painting or something. There's a difference between aesthetically bothersome and actually harmful. Not to say that you shouldn't run a formatter, but just for some perspective. | |
| ▲ | jffhn an hour ago | parent | next [-] | | >Do people really argue about variable names? Of course they do. A program's code is mostly a graph of names; they can be cornerstones of its clarity, or sources of confusion and bugs. The first thing I do when debugging is ensuring proper names, sometimes that's enough to make the bug obvious. | |
| ▲ | anal_reactor 2 hours ago | parent | prev [-] | | Yes. 80% of comments to my PRs are "change _ to -" or something like that. | | |
| ▲ | blitzar 21 minutes ago | parent [-] | | PR #467 - Reformat code from tabs to spaces PR #515 - Reformat code from spaces to tabs |
|
| |
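One common way to end the tabs-vs-spaces churn joked about in the thread above is to pin formatting in a shared, machine-enforced config rather than litigating it PR by PR — e.g. a minimal `.editorconfig` (illustrative values, not taken from the thread):

```ini
# .editorconfig — checked into the repo root so every editor agrees
root = true

[*]
charset = utf-8
end_of_line = lf
insert_final_newline = true

[*.py]
indent_style = space
indent_size = 4
```

With something like this (or an auto-formatter run in CI), PRs #467 and #515 never get written.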
| ▲ | _kidlike an hour ago | parent | prev | next [-] | | I'm very surprised by these comments... I regularly review code that is way more complicated than it should be. The last few days I was going back and forth on reviews of a function that originally had a cyclomatic complexity of 23. Eventually I got it down to 8, but I had to call the author into a pair programming session and show him how the complexity could be reduced. | |
| ▲ | servo_sausage an hour ago | parent [-] | | Someone giving work like that should be either junior enough that there is potential for training them, so your time investment is worth it, or managed out. Or it didn't really matter that the function was complex if the structure of what's surrounding it was robust and testable; just let it be a refactor or bug ticket later. |
| |
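A hypothetical sketch of the kind of refactor described a couple of comments up (not the actual code from the thread): guard clauses and a dispatch table replace nested branching, cutting cyclomatic complexity without changing behavior.

```python
# Before (hypothetical): nested if/elif chains — each branch adds
# one to the cyclomatic complexity count.
def shipping_cost_nested(region, weight_kg, express):
    if region is not None:
        if region == "EU":
            if express:
                return 20 + 4 * weight_kg
            else:
                return 10 + 2 * weight_kg
        elif region == "US":
            if express:
                return 25 + 5 * weight_kg
            else:
                return 12 + 3 * weight_kg
        else:
            raise ValueError("unknown region")
    else:
        raise ValueError("missing region")

# After: one guard clause plus a dispatch table — a single decision
# point instead of a branch per region/option combination.
RATES = {
    ("EU", False): (10, 2),
    ("EU", True): (20, 4),
    ("US", False): (12, 3),
    ("US", True): (25, 5),
}

def shipping_cost(region, weight_kg, express):
    if (region, express) not in RATES:
        raise ValueError(f"unknown region/option: {region}, {express}")
    base, per_kg = RATES[(region, express)]
    return base + per_kg * weight_kg

# Both versions agree on the same inputs.
assert shipping_cost("EU", 2, False) == shipping_cost_nested("EU", 2, False) == 14
```

The table version is also easier to review: adding a region is a data change, not a new branch.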
| ▲ | worldsayshi 2 hours ago | parent | prev | next [-] | | People always make mistakes. Like forgetting to include a change. The point of PRs for me is to try to weed out costly mistakes. Automated tests should hopefully catch most of them though. | |
| ▲ | devmor 3 hours ago | parent | prev [-] | | I used to do this! I can’t anymore, not with the advent of AI coding agents. My trust in my colleagues is gone, I have no reason to believe they wrote the code they asked me to put my approval on, and so I certainly don’t want to be on a postmortem being asked why I approved the change. Perhaps if I worked in a different industry I would feel like you do, but payments is a scary place to cause downtime. |
| |
| ▲ | rendall 2 hours ago | parent | prev | next [-] | | Yes. This is the way. Declarative design contracts are the answer to A.I. coders. A team declares what they want, agents code it together with human supervision. Then code review is just answering the question "is the code conformant with the design contract?" But. The design contract needs review, which takes time. | |
| ▲ | jauntywundrkind 3 hours ago | parent | prev [-] | | I wonder what delayed continuous release would be like. Trust folks to merge semi-responsibly, but have a two week delay before actually shipping to give yourself some time to find and fix issues. Perhaps kind of a pain to inject fixes in, have to rebase the outstanding work. But I kind of like this idea of the org having responsibility to do what review it wants, without making every person have to coral all the cats to get all the check marks. Make it the org's challenge instead. |
|
|
| ▲ | trigvi 2 minutes ago | parent | prev | next [-] |
| Excellent article. Based on personal experience, if you build cutting-edge stuff then you need great engineers and reviewers. But for anything else, you just need an individual (not a team) who's okay (not great) at multiple things (architecting, coding, communicating, keeping costs down, testing their stuff). Let them build and operate something from start to finish without reviewing. Judge it by how well their product works. |
|
| ▲ | afc 2 minutes ago | parent | prev | next [-] |
| Waiting for a few days of design review is a pain that is easy to avoid: all we need is to be ready to spend a few months building a potentially useless system. |
|
| ▲ | nfw2 5 minutes ago | parent | prev | next [-] |
| From the article:

1. Whoa, I produced this prototype so fast! I have super powers!
2. This prototype is getting buggy. I’ll tell the AI to fix the bugs.
3. Hmm, every change now causes as many new bugs as it fixes.
4. Aha! But if I have an AI agent also review the code, it can find its own bugs!
5. Wait, why am I personally passing data back and forth between agents
6. I need an agent framework
7. I can have my agent write an agent framework!
8. Return to step 1

The author seems to imply this is recursive when it isn't. When you have an effective agent framework you can ship more high quality code quickly. |
|
| ▲ | pu_pe 12 minutes ago | parent | prev | next [-] |
| Nice piece, and rings true. I also think startups and smaller organizations will be able to capture better value out of AI because they simply don't have all those approval layers. |
|
| ▲ | thot_experiment 4 hours ago | parent | prev | next [-] |
| Valve is one of the only companies that appears to understand this, as well as that individual productivity is almost always limited by communication bandwidth, and communication burden is exponential as nodes in the tree/mesh grow linearly. [or some derated exponent since it doesn't need to be fully connected] |
| |
| ▲ | MrBuddyCasino 16 minutes ago | parent [-] | | The first one to realise this was Jeff Bezos, afaik. One would think the others have wised up in the meantime, but no. |
|
|
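The communication-burden claim a few comments up is usually modeled as the pairwise-channels count n(n-1)/2 — quadratic rather than literally exponential, but growing fast enough to make the point (a sketch; as the comment itself notes, real teams are less than fully connected):

```python
# Pairwise communication channels in a fully connected team of n
# people: n * (n - 1) / 2. Real teams are sparser, but the growth
# still far outpaces headcount.
def channels(n: int) -> int:
    return n * (n - 1) // 2

for n in (2, 5, 10, 50):
    print(n, channels(n))
# 2 -> 1, 5 -> 10, 10 -> 45, 50 -> 1225
```

Going from 10 to 50 people multiplies headcount by 5 but potential channels by 27.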
| ▲ | lelanthran 4 hours ago | parent | prev | next [-] |
| I wonder where the author worked where PRs are addressed in 5 hours. IME it's measured in units of days, not hours. I agree with him anyway: if every dev felt comfortable hitting a stop button to fix a bug then reviewing might not be needed. The reality is that any individual dev will get dinged for not meeting a release objective. |
| |
| ▲ | usr1106 2 hours ago | parent | next [-] | | I worked in a company where reviews took days. The CTO complained a lot about the speed, but we had decent code quality. Now I work at a company where reviews take minutes. We have 5 lines of technical debt per 3 lines of code written. We spend months to work on complicated bugs that have made it to production. | |
| ▲ | ivanjermakov 35 minutes ago | parent | prev | next [-] | | I've yet to see a project where reviews are handled seriously. Both business and developers couldn't care less. | |
| ▲ | eterm 28 minutes ago | parent | next [-] | | I worked somewhere that actually had a great way to deal with this. It only works in small teams though. We had a "support rota", i.e. one day a week you'd be essentially excused from doing product delivery. Instead, you were the dev to deal with bug triage, any code reviews, questions about the product, etc. Any spare time was spent looking for bugs in the backlog to further investigate / squash. Then when you were done with your support day you were back to sprint work. This meant there was no ambiguity about who to ask for code review, and it limited / eliminated siloing of skills since everyone had to be able to review anyone else's work. That obviously doesn't scale to large teams, but it worked wonders for a small team. | |
| ▲ | mcdeltat 26 minutes ago | parent | prev [-] | | Bonus points: reviews are not taken seriously in the legitimate sense, but a facade of seriousness consisting of picky complaints is put forth to reinforce hierarchy and gatekeeping |
| |
| ▲ | jannyfer 4 hours ago | parent | prev | next [-] | | At the bottom of the page it says he is CEO of Tailscale. | |
| ▲ | devmor 3 hours ago | parent | prev [-] | | I’ve worked on teams like you describe and it’s been terrible. My current team’s SDLC is more along the 5-hour line - if someone hasn’t reviewed your code by the end of today, you bring it up in standup and have someone commit to doing it. |
|
|
| ▲ | lukaslalinsky an hour ago | parent | prev | next [-] |
| Reviewing things is fast and smooth if things are small. If you have all the involved parties stay in the loop, review happens in real time. Review is only problematic if you split the do and review steps. The same applies to AI coding: you can choose to pair program with it, and then it's actually helpful, or you can have it generate 10k lines of code you have no way of reviewing. You just need people to understand that switching context kills productivity. If more things are happening at the same time and your memory is limited, the time spent on load/save makes it slower than just doing one thing at a time and staying in the loop. |
| |
| ▲ | rafaelmn an hour ago | parent [-] | | Honestly if I'm just following what a single LLM is doing I'm arguably slower than doing it myself, so I'd say that approach isn't very useful for me. I prefer to review the plan (this is more to flush out my assumptions about where something fits in the codebase and verify I communicated my intent correctly). I'll loosely monitor the process if it's a longer one - then I review the artifacts. This way I can be doing 2-3 things in parallel, using other agents or doing meetings/prod investigation/making coffee/etc. |
|
|
| ▲ | superlopuh an hour ago | parent | prev | next [-] |
| In my experience a culture where teammates prioritise review times (both by checking on updates in GH a few times a day, and by splitting changes aggressively into smaller patches) is reflected in much faster overall progress. It's definitely a culture thing; there's nothing technically or organisationally difficult about implementing it, it just requires people working together to consider team velocity more important than personal velocity. |
| |
| ▲ | threatofrain an hour ago | parent [-] | | Let's say a teammate is writing code to do geometric projection of streets and roads onto live video. Another teammate is writing code to do automated drone pursuit of cars. Let's say I'm over here writing auth code, making sure I'm modeling all the branches which might occur in some order. To what degree do we expect intellectual peerage from someone just glancing into this problem because of a PR? I would expect that to be a proper intellectual peer of someone studying the problem, you'd basically have to double your efforts. | | |
| ▲ | pm215 21 minutes ago | parent | next [-] | | If the team is that small and working on things that are that disparate, then it is also very vulnerable to one of those people leaving, at which point there's a whole part of the project that nobody on the team has a good understanding of. Having somebody else devote enough time to being up to speed enough to do code review on an area is also an investment in resilience so the team isn't suddenly in huge difficulty if the lone expert in that area leaves. It's still a problem, but at least you have one other person who's been looking at the code and talking about it with the now-departed expert, instead of nobody. | |
| ▲ | servo_sausage 37 minutes ago | parent | prev [-] | | This is an unusually low overlap per topic; it probably needs a different structure than traditional PRs to get the best chance of benefiting from more eyes... higher-scope planning, or something like longer but intermittent pair programming. Generally, if the reviewer is not familiar with the content, asynchronous line-by-line reviews are of limited value. |
|
|
|
| ▲ | tptacek 4 hours ago | parent | prev | next [-] |
| Not before coding agents nor after coding agents has any PR taken me 5 hours to review. Is the delay here coordination/communication issues, the "Mythical Mammoth" stuff? I could buy that. |
| |
| ▲ | Aurornis 4 hours ago | parent | next [-] | | The article is referring to the total time including delays. It isn’t saying that PR review literally takes 5 hours of work. It’s saying you have to wait about half a day for someone else to review it. | | |
| ▲ | yxhuvud 2 hours ago | parent [-] | | Which is a thing that depend very much on team culture. In my team it is perhaps 15 min for smaller fixes to get signoff. There is a virtuous feedback loop here - smaller PRs give faster reviews, but also more frequent PRs, which give more frequent times to actually check if there is something new to review. |
| |
| ▲ | sevenseacat 19 minutes ago | parent | prev | next [-] | | I've had PRs that take me five hours to review. If your one PR is an entire feature that touches the database, the UI, and an API, and I have to do the QA on every part of it because as soon as I give the thumbs up it goes out the door to clients? Then it's gonna take a while, and I'm probably going to find a few critical issues, and then the loop starts again. | |
| ▲ | abtinf 4 hours ago | parent | prev | next [-] | | The PR won’t take 5 hours of work, but it could easily sit that long waiting for another engineer willing to context switch away from their own heads-down work. | |
| ▲ | paulmooreparks 4 hours ago | parent | next [-] | | Exactly. Even if I hammer the would-be reviewer with Teams/Slack messages to get it moved to the top of the queue and finished before the 5 hours are up, then all the other reviews get pushed down. It averages out, and the review market corrects. | |
| ▲ | bsjshshsb 2 hours ago | parent | prev [-] | | Exactly. Can you get a lawyer on the phone now, or do you wait ~5 hours? How about a doctor appointment? Or a vet appointment? Or a mechanic visit? Needing full human attention on a complex task from a pro who can only look at your thing has a wait time. It is worse when there are only 2 or 3 such people in the world you can ask! |
| |
| ▲ | nixon_why69 4 hours ago | parent | prev | next [-] | | The article specified wall clock time. One day turnaround is pretty typical if it's not urgent enough to demand immediate review; lots of people review incoming PRs as a morning activity. | |
| ▲ | ukuina 2 hours ago | parent | prev | next [-] | | > "Mythical Mammoth" Most excellent. | |
| ▲ | lelanthran 4 hours ago | parent | prev | next [-] | | Some devs interrupt what they are doing when they see a PR in a Slack notification, most don't. Most devs set aside some time at most twice a day for PRs. That's 5 hours at least. Some PRs come in at the end of the day and will only get looked at the next day. That's more than 5 hours. IME it's rare to see a PR get reviewed in under 5 hours. | | |
| ▲ | CBLT 3 hours ago | parent | next [-] | | I use a PR notifier chrome extension, so I have a badge on the toolbar whenever a PR is waiting on me. I get to them in typically <2 minutes during work hours because I tab over to chrome whenever AI is thinking. Sometimes I even get to browse HN if not enough PRs are coming and not too many parallel work sessions. | |
| ▲ | riffraff 3 hours ago | parent | prev [-] | | But there's more than one person that can review a PR. If you work in a team of 5 people, and each one only reviews things twice a day, that's still less than 5 hours any way you slice it. |
| |
| ▲ | squirrellous an hour ago | parent | prev [-] | | One pattern I've seen is that a team with a decently complex codebase will have 2-3 senior people who have all of the necessary context and expertise to review PRs in that codebase. They also assign projects to other team members. All other team members submit PRs to them for review. Their review queue builds up easily and average review time tanks. Not saying this is a good situation, but it's quite easy to run into it. |
|
|
| ▲ | codemog 2 hours ago | parent | prev | next [-] |
| This reads like a scattered mind with a few good gems, a few assumptions that are incorrect but baked into the author’s world view, and loose coherence tying it all together. I see a lot of myself in it. I’ll cover one of them: layers of management or bureaucracy do not reduce risk. They create inaction, which gives the appearance of reducing risk, until some startup comes and gobbles up your lunch. Upper management knows it’s all bullshit and the game-theoretic play is to say no to things, because you’re not held accountable if you say no, so they say no and milk the money printer until the company stagnates and dies. Then they repeat at another company (usually with a new title and promotion). |
|
| ▲ | abtinf 4 hours ago | parent | prev | next [-] |
| I find this to be true of expensive approvals as well. If I can approve something without review, it’s instant. If it requires only my immediate manager, it takes a day. Second level takes at least ten days. Third level trivially takes at least a quarter (at least two if approaching the end of the fiscal year). And the largest proposals I’ve pushed through at large companies, going up through the CEO, take over a year. |
|
| ▲ | simianwords an hour ago | parent | prev | next [-] |
| I don’t agree that AI can’t fix this. It is too easy to dismiss. With AI, my job in review is to check high-level design choices and skip the low-level details. It’s much simpler. |
|
| ▲ | p0w3n3d 4 hours ago | parent | prev | next [-] |
| Meanwhile there are people who, as we speak, say that AI will do review and all we need to do is to provide quality gates... |
| |
|
| ▲ | halo an hour ago | parent | prev | next [-] |
| In my experience, good mature organisations have clear review processes to ensure quality, improve collaboration and reduce errors and risk. This is regardless of field. It does slow you down - not 10x - but the benefits outweigh the downsides in the long run. The worst places I’ve worked have a pattern where someone senior drives a major change without any oversight, review or understanding causing multiple ongoing issues. This problem then gets dumped onto more junior colleagues, at which point it becomes harder and more time consuming to fix (“technical debt”). The senior role then boasts about their successful agile delivery to their superiors who don’t have visibility of the issues, much to the eye-rolls of all the people dealing with the constant problems. |
|
| ▲ | riffraff 3 hours ago | parent | prev | next [-] |
| > Code a simple bug fix — 30 minutes
> Get it code reviewed by the peer next to you — 300 minutes → 5 hours → half a day

If it takes 5 hours for a peer to review a simple bugfix, your operation is dysfunctional. |
| |
| ▲ | thi2 2 hours ago | parent | next [-] | | It's rare that devs are on standby, waiting for a PR to review. Usually they are working on their own PR, are in meetings, or have focus time. We talked a lot about the costs of context switches, so it's reasonable to finish your work before switching to the review. | |
| ▲ | habinero 3 hours ago | parent | prev | next [-] | | People are busy, and small bugfixes are usually not that critical. If you make everyone drop everything to review everything, that is much more dysfunctional. | |
| ▲ | karel-3d an hour ago | parent | prev [-] | | nobody will immediately jump on your code review |
|
|
| ▲ | DeathArrow 24 minutes ago | parent | prev | next [-] |
| I totally agree with his ideas, but he somehow seems to just be stating the obvious: startups move faster than big orgs, and you can solve a problem by dividing it into smaller problems - if possible. And that AI experimentation is cheap. |
|
| ▲ | PunchyHamster 15 minutes ago | parent | prev | next [-] |
| > I know what you're thinking. Come on, 10x? That’s a lot. It’s unfathomable. Surely we’re exaggerating.

See this rarely known trick! You can be up to 9x more efficient if you code something else while you wait for review.

> AI projectile vomits

Fuck engineering, let's work on methods to make the artificial retard more efficient! |
|
| ▲ | sublinear 4 hours ago | parent | prev | next [-] |
| As they say: an hour of planning saves ten hours of doing. You don't need so much code or maintenance work if you get better requirements upfront. I'd much rather implement things at the last minute knowing what I'm doing than cave in to the usual incompetent middle manager demands of "starting now to show progress". There's your actual problem. |
| |
| ▲ | lmm 4 hours ago | parent [-] | | > As they say: an hour of planning saves ten hours of doing. In software it's the opposite, in my experience. > You don't need so much code or maintenance work if you get better requirements upfront. Sure, and if you could wave a magic wand and get rid of all your bugs that would cut down on maintenance work too. But in the real world, with the requirements we get, what do we do? | | |
| ▲ | JoshTriplett 3 hours ago | parent [-] | | > In software it's the opposite, in my experience. That's been my experience as well: ten hours of doing will definitely save you an hour of planning. If you aren't getting requirements from elsewhere, at least document the set of requirements you think you're working towards, and post them for review. You sometimes get new useful requirements very fast if you post "wrong" ones. | | |
| ▲ | seer 2 hours ago | parent [-] | | I think what they meant is you “can save 10 hours of planning with one hour of doing”. And I think this has become even more so in the age of AI, because there are even more unknown unknowns, which are harder to discover while planning but easy while “doing”, and that “doing” itself is so much more streamlined. In my experience no amount of planning will de-risk software engineering effort; what works is making sure that coming back, refactoring, or switching tech is less expensive, which allows you to rapidly change the approach when you inevitably discover some roadblock. You can read all the docs during planning phases, but you will stumble on some undocumented behaviour / bug / limitation every single time and then you are back to the drawing board. The faster you can turn that around the faster you can adjust and go forward. I really like the famous quote from Churchill - “Plans are useless, planning is essential” | |
| ▲ | kmijyiyxfbklao 12 minutes ago | parent | next [-] | | Planning includes the prototype you build with AI. | |
| ▲ | JoshTriplett an hour ago | parent | prev [-] | | > I think what they meant is you “can save 10 hours of planning with one hour of doing” I know what they meant, and I also meant the thing I said instead. I have seen many, many people forge ahead on work that could have been saved by a bit more planning. Not overplanning, but doing a reasonable amount of planning. Figuring out where the line is between planning and "just start trying some experiments" is a matter of experience. |
|
|
|
|
|
| ▲ | usr1106 3 hours ago | parent | prev | next [-] |
What makes me slower at the moment is the AI slop my team lead posts into reviews. I have to spend time arguing why that's not a valid comment. |
|
| ▲ | camillomiller 2 hours ago | parent | prev | next [-] |
>> Now you either get to spend 27 minutes reviewing the code yourself in a back-and-forth loop with the AI (this is actually kinda fun); or you save 27 minutes and submit unverified code to the code reviewer, who will still take 5 hours like before, but who will now be mad that you’re making them read the slop that you were too lazy to read yourself. Little of value was gained. This seems to check out, and it's the reason why I can't reconcile the industry's claims about worker replacement with reality. I still wonder when a reckoning will come, though. It seems long overdue in the current environment. |
|
| ▲ | markbao 4 hours ago | parent | prev | next [-] |
| If you save 3 hours building something with agentic engineering and that PR sits in review for the same 30 hours or whatever it would have spent sitting in review if you handwrote it, you’re still saving 3 hours building that thing. So in that extra time, you can now stack more PRs that still have a 30 hour review time and have more overall throughput (good lord, we better get used to doing more code review) This doesn’t work if you spend 3 minutes prompting and 27 minutes cleaning up code that would have taken 30 minutes to write anyway, as the article details, but that’s a different failure case imo |
| |
▲ | lelanthran 3 hours ago | parent | next [-] | | > So in that extra time, you can now stack more PRs that still have a 30 hour review time and have more overall throughput Hang on, you think that a queue that drains at a rate of $X/hour can be filled at a rate of 10x$X/hour? No, it cannot: it doesn't matter how fast you fill a queue if the queue has a constant drain rate; sooner or later you hit the bounds of the queue, or the items taken off the queue are too stale to matter. In this case, filling a queue at a rate of 20 items per hour (one every 3 minutes) while it drains at a rate of one item every 5 hours means that after a single 8-hour day your last PR has roughly (8x20) - 1 = 159 PRs queued ahead of it; at one review every 5 hours, that's nearly 800 hours of time-to-review. Your PRs after the second day are going to take 1,500+ hours. | | |
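The arithmetic in the comment above can be checked in a few lines. The rates (20 PRs per hour, one review every 5 hours, 8-hour days) are the ones from the comment; the helper name `last_pr_wait_hours` and the single-reviewer, initially-empty-queue assumptions are mine:

```python
# Back-of-envelope model of the PR review backlog described above.

def last_pr_wait_hours(arrivals_per_hour: float, hours_filling: float,
                       hours_per_review: float) -> float:
    """Hours the last-submitted PR waits, assuming one reviewer working
    continuously and a queue that starts empty."""
    queued = arrivals_per_hour * hours_filling       # PRs produced
    drained = hours_filling / hours_per_review       # reviews finished meanwhile
    backlog_ahead = max(queued - drained - 1, 0)     # PRs ahead of the last one
    return backlog_ahead * hours_per_review

print(last_pr_wait_hours(20, 8, 5))   # one 8-hour day of prompting: ~787 hours
print(last_pr_wait_hours(20, 16, 5))  # two days: ~1579 hours
```

The exact numbers depend on the modelling assumptions, but the shape doesn't: when the arrival rate exceeds the drain rate, the wait for the newest item grows without bound.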
▲ | zmmmmm 3 hours ago | parent | next [-] | | This is the fundamental issue currently in my situation with AI code generation. There are some strategies that help: a lot of the AI directives need to go towards making the code actually easy to review. A lot of it sits around clarity and granularity (code should be committed primarily in reviewable chunks - units of work that make sense for review) rather than whatever you would have done previously when code production was the bottleneck. Similarly, AI use needs to be weighted not just more towards tests, but towards tests that concretely and clearly answer questions that come up in review (what happens on this boundary condition? what if that variable is null? etc). Finally, changes need to be stratified along lines of risk rather than code modularity or other dimensions. That is, if a change is evidently risk free (in the sense of "even if this IS broken, it doesn't matter") it should be able to be rapidly approved / merged. Only things where it actually matters if it's wrong should be blocked. I have a feeling there are whole areas of software engineering where best practices are just operating on inertia and need to be reformulated now that the underlying cost dynamics have fundamentally shifted. | | |
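The "tests that answer review questions" idea above might look like the sketch below, where each test names the reviewer question it settles. `clamp_percentage` is a hypothetical helper, not anything from the thread:

```python
def clamp_percentage(value):
    """Clamp a numeric value into 0-100; None means 'unset' and maps to 0."""
    if value is None:
        return 0
    return min(max(value, 0), 100)

# Reviewer question: what happens on the boundary conditions?
assert clamp_percentage(0) == 0
assert clamp_percentage(100) == 100
assert clamp_percentage(100.5) == 100
assert clamp_percentage(-1) == 0

# Reviewer question: what happens if the variable is null?
assert clamp_percentage(None) == 0
```

The point is that a reviewer can read the assertions instead of re-deriving the edge cases from the implementation.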
| ▲ | balamatom 2 hours ago | parent [-] | | >Finally, changes need to be stratified along lines of risk rather than code modularity or other dimensions. Why don't those other dimensions, and especially the code modularity, already reflect the lines of business risk? Lemme guess, you cargo culted some "best practices" to offload risk awareness, so now your code is organized in "too big to fail" style and matches your vendor's risk profile instead of yours. | | |
▲ | zmmmmm 2 hours ago | parent [-] | | > Why don't those other dimensions, and especially the code modularity, already reflect the lines of business risk? I guess the answer (if you're really asking seriously) is that previously, when code production cost so far outweighed everything else, it made sense to structure everything to optimise efficiency in that dimension. So if a change was implemented, the developer would deliver it as a functional unit that might cut across several lines of risk (low-risk changes like updating some CSS sitting alongside higher-risk ones like a database migration, all bundled together), because this was what made it fastest for the developer to implement the code. Now if AI is doing it, screw how easy or fast it is to make the change: deliver it in reviewable chunks. Was the original method cargo culted? I think most of what we do is cargo culted regardless. Virtually the entire software industry is built that way. So probably. |
|
| |
| ▲ | balamatom 2 hours ago | parent | prev [-] | | You are considering a good-faith environment where GP cares about throughput of the queue. I think GP is thinking in terms of being incentivized by their environment to demonstrate an image of high personal throughput. In a dysfunctional organization one is forced to overpromise and underdeliver, which the AI facilitates. |
| |
| ▲ | josephg 4 hours ago | parent | prev | next [-] | | If your team's bottleneck is code review by senior engineers, adding more low quality PRs to the review backlog will not improve your productivity. It'll just overwhelm and annoy everyone who's gotta read that stuff. Generally if your job is acting as an expensive frontend for senior engineers to interact with claude code, well, speaking as a senior engineer I'd rather just use claude code directly. | | |
| ▲ | eru 4 hours ago | parent [-] | | Linting, compiler warnings and automated tests have helped a lot with the grunt work of code review in the past. We can use AI these days to add another layer. |
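The automated layer eru describes amounts to a gate that runs before any human looks at the diff. A minimal sketch, where `machine_review` and its single `py_compile` check are illustrative stand-ins (a real gate would also run a linter and the test suite):

```python
import os
import subprocess
import sys
import tempfile

def machine_review(source: str) -> bool:
    """Return True if the cheap automated checks pass for a Python source."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(source)
        path = f.name
    try:
        checks = [
            [sys.executable, "-m", "py_compile", path],  # cheapest layer: does it parse?
            # a real gate would append linter and test-suite commands here
        ]
        return all(subprocess.run(c, capture_output=True).returncode == 0
                   for c in checks)
    finally:
        os.unlink(path)

print(machine_review("print('ok')"))   # passes: ready for a human reviewer
print(machine_review("def broken(:"))  # fails: bounce before wasting review time
```

Everything the machine catches here is something a human reviewer no longer has to.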
| |
| ▲ | CuriouslyC 4 hours ago | parent | prev [-] | | Except that when you have 10 PRs out, it takes longer for people to get to them, so you end up backlogged. | | |
▲ | zmmmmm 3 hours ago | parent [-] | | And when the PR you never even read (because the AI wrote it) gets bounced back to you with an obscure question 13 days later, you're not going to be well positioned to respond to that. |
|
|
|
| ▲ | simonw 4 hours ago | parent | prev | next [-] |
| This is one of the reasons I'm so interested in sandboxing. A great way to reduce the need for review is to have ways of running code that limit the blast radius if the code is bad. Running code in a sandbox can mean that the worst that can happen is a bad output as opposed to a memory leak, security hole or worse. |
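One minimal way to sketch the blast-radius idea, assuming Unix and CPython: run the code in a child process with a CPU-time cap, a memory cap, and a wall-clock timeout. The function name and limits are illustrative; a real sandbox would also restrict network and filesystem access:

```python
import resource
import subprocess
import sys

def run_sandboxed(code: str, timeout_s: int = 10) -> subprocess.CompletedProcess:
    """Run untrusted Python in a child process with resource limits."""
    def limit_resources():
        resource.setrlimit(resource.RLIMIT_CPU, (5, 5))              # 5s of CPU
        resource.setrlimit(resource.RLIMIT_AS, (512 * 2**20,) * 2)   # 512 MiB
    return subprocess.run(
        [sys.executable, "-I", "-c", code],   # -I: isolated mode, no site dirs
        capture_output=True, text=True,
        timeout=timeout_s, preexec_fn=limit_resources,
    )

result = run_sandboxed("print(2 + 2)")
print(result.stdout)  # the worst case here is bad output, not a compromised host
```

An infinite loop hits the CPU limit, a memory bomb hits the address-space limit, and a hang hits the timeout; the parent process survives all three.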
| |
▲ | MeetingsBrowser 4 hours ago | parent | next [-] | | Isn’t “bad output” already the worst case? Pre-LLMs, correct output was table stakes. You expect your calculator to always give correct answers, your bank to always transfer your money correctly, and so on. | |
▲ | KnuthIsGod 4 hours ago | parent | prev [-] | | And if the bad output leads to a decision maker making a bad decision that takes down your company or kills your relative? | | |
| ▲ | riffraff 3 hours ago | parent [-] | | The sandbox in question was to absorb shrapnel from explosions, clearly |
|
|
|
▲ | jbrozena22 4 hours ago | parent | prev [-] |
I think the problem is the shape of review processes. People higher up in the corporate food chain are needed to give approval on things. These people also have to manage enormous teams with their own complexities. Getting on their schedule is difficult, and giving you a decision isn't their top priority, which slows down time to market for everything. So we will need to extract decision-making responsibility from people management and let decision makers focus exclusively on reviewing inputs and approving or rejecting them, under an SLA. My hypothesis is that the future of work in tech will be a series of these input/output queue reviewers. It's going to be really boring, I think. Probably like how it's boring being a factory robot monitor. |