GPT-5 Pro catches more bugs in my code than I do now. It is very very good.

LLMs are pretty consistent about what types of tasks they are good at, and which they are bad it. That means people can learn when to use them, and when to avoid them. You really don't have to be so black-and-white about it. And if you are checking the LLM's results, you have nothing to worry about.

Needing to verify the results does not negate the time savings either when verification is much quicker than doing a task from scratch.

My code is definitely of higher quality now that I have GPT-5 Pro review all my changes, and then I review my code myself as well. It seems obvious to me that if you care, LLMs can help you produce better code. As always, it is only people who are lazy who suffer. If you care about producing great code, then LLMs are a brilliant tool to help you with just that, in less time, by helping with research, planning, and review.

▲

tolmasky 3 days ago | parent [-]

This doesn't really address the point that is currently being argued I think, so much so that I think your comment is not even in contention with mine (perhaps you didn't intend it to be!). But for lack of a better term, you are describing a "closed experience". You are (to some approximation) assuming the burden of your choices here. You are applying the tool to your work, and thus are arguably "qualified" to both assess the applicability of the tool to the work, and to verify the results. Basically, the verification "scales" with your usage. Great.

The problem that OP is presenting is that, unlike in your own use, the verification burden from this "open source" usage is not taken on by the "contributors", but instead "externalized" to maintainers. This does not result in the same "linear" experience you have, their experience is asymmetric, as they are now being flooded with a bunch of PRs that (at least currently) are harder to review than human submissions. Not to mention that also unlike your situation, they have no means to "choose" not to use LLMs if they for whatever reason discover it isn't a good fit for their project. If you see something isn't a good fit, boom, you can just say "OK, I guess LLMs aren't ready for this yet." That's not a power maintainers have. The PRs will keep coming as a function of the ease to create them, not as a function of their utility. Thus the verification burden does not scale with the maintainer's usage. It scales with the sum of everyone who has decided they can ask an LLM to go "help" you. That number both larger and out of their control.

The main point of my comment was to say that this situation is not only to be expected, but IMO essential and inseparable from this kind of use, for reasons that actually follow directly from your post. When you are working on your own project, it is totally reasonable to treat the LLM operator as qualified to verify the LLMs outputs. But the opposite is true when you are applying it to someone else's project.

> Needing to verify the results does not negate the time savings either when verification is much quicker than doing a task from scratch.

This is of course only true because of your existing familiarity with of the project you are working on. This is not a universal property of contributions. It is not "trivial" for me to verify a generated patch in a project I don't understand, for reasons ranging from things as simple as the fact that I have no idea what the code contribution guidelines are (who am I to know if I am even following the style guidelines) to things as complicated as the fact that I may not even be familiar with the programming language the project is written in.

> And if you are checking the LLM's results, you have nothing to worry about.

Precisely. This is the crux of the issue -- I am saying that in the contribution case, it's not even about whether you are checking the results, it's that you arguably can't meaningfully check the results (unless you of course essentially put in nearly the same amount of work as just writing it from scratch).

It is tempting to say "But isn't this orthogonal to LLMs? Isn't this also the case with submitting PRs you created yourself?" No! It is qualitatively different. Anyone who has ever submitted a meaningful patch to a project they've never worked on before has had the experience of having to familiarize themselves with the relevant code in order to create said patch. The mere act of writing the fix organically "bootstraps" you into developing expertise in the code. You will if nothing else develop an opinion on the fix you chose to implement, and thus be capable of discussing it after you've submitted it. You, the PR submitter, will be worthwhile to engage with and thus invest time in. I am aware that we can trivially construct hypothetical systems where AI agents are participating in PR discussions and develop something akin to a long term "memory" or "opinion" -- but we can talk about that experience if and when it ever comes into being, because that is not the current lived experience of maintainers. It's just a deluge of low quality one-way spam. Even the corporations that are specifically trying to implement this experience just for their own internal processes are not particularly... what's a nice way to put this, "satisfying" to work with, and that is for a much more constrained environment, vs. "suggesting valuable fixes to any and all projects".

	▲	rhdunn 3 days ago \| parent \| next [-]
		I'm not advocating that the verification should be on the maintainer. It should definitely be on the contributor/submitter to verify that what they are submitting is correct to the best of their abilities. This applies if the reporter found the bug themselves, used a static analysis tool like Coverity, used a fuzzing tool, used valgrind or similar, used an LLM, or some other mechanism to identify the issue. In each case the reporter needs to at a minimum check if what they found is actually an issue and ideally provide a reproducible test case ("this file causes the application to crash", etc.), logs if relevant, etc.
	▲	sothatsit 2 days ago \| parent \| prev [-]
		I was arguing against your dismissal of the value proposition of LLMs. I wasn't arguing about the case of open-source maintainers getting spammed by low-quality issues and PRs (where I think we agree on a lot of points). The way that you argued that the value proposition of LLMs makes no sense takes a really black-and-white view of modern AI. There are actually a lot of tasks where verification is easier than doing the task yourself, even in areas where you are not an expert. You just have to actually do the verification (which is the primary problem with open-source maintainers getting spammed by people who do not verify anything). For example, I have recently been writing a proxy for work, but I'm not that familiar with networking setups. But using LLMs, I've been able to get to a robust solution that will cover our use-cases. I didn't need to be an expert in networking. My experience in other areas of computer science combined with LLMs to help me research let me figure out how to get our proxy to work. Maybe there is some nuance I am missing, but I can verify that the proxy correctly gets the traffic and I can figure out where it needs to go, and that's enough to make progress. There is some academic purity lost in this process of using LLMs to extend the boundary of what you can accomplish. This has some pretty big negatives, such as allowing people with little experience to create incredibly insecure software. But I think there are a lot more cases where if you verify the results you get, and you don't try to extend too far past your knowledge, it gives people great leverage to do more. This is to say, you don't have to be an expert to use an LLM for a task. But it does help a lot to have some knowledge about related topics at least, to ground you. Therefore, I would say LLMs can greatly expand the scope of what you can do, and that is of great value (even if they don't help you do literally everything with a high likelihood of success). Additionally, coding agents like Claude Code are incredible at helping you get up-to-speed with how an existing codebase works. It is actually one of the most amazing use-cases for LLMs. It can read a huge amount of code and break it down for you so you can start figuring out where to start. This would be of huge help when trying to contribute to someone else's repository. LLMs can also help you with finding where to make a change, writing the patch, setting up a test environment to verify the patch, looking for project guidelines/styleguides to follow, helping you to review your patch against those guidelines, and helping you to write the git commit and PR description. There's so many areas where they can help in open-source contributions. The main problem in my eyes is people that come to a project and make a PR because they want the "cred" of contributing with the least possible effort, instead of because they have an actual bug/feature they want to fix/add to the project. The former is noise, but the latter always has at least one person who benefits (i.e., you).