Apologies, but your information is either outdated from lack of experience with the latest frontier models, or you don't realize that 99.9% of the work you do is not novel in any meaningful sense. Have you only used Copilot, or something? Because that's what it sounds like. The performance of the latest models (Opus 4.6 at max effort, gpt-5.3-Codex) is nothing short of astonishing.
Real-world example: Claude isn't familiar with the latest Zig, so I had it write a language guide for 0.15.2 (here: https://gist.github.com/pmarreck/44d95e869036027f9edf332ce9a...) that points out all the differences from earlier versions, and with that in context I haven't had to touch a line of code myself to do the updates.
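To give a flavor of the kind of difference such a guide catalogs (this is from memory, not quoted from the gist, so the exact details may be off): if I recall correctly, std.ArrayList became unmanaged by default in 0.15, so it no longer stores its allocator and every call site changes shape:

    const std = @import("std");

    pub fn main() !void {
        const allocator = std.heap.page_allocator;

        // 0.14.x style (stored allocator; no longer compiles on 0.15.x):
        //   var list = std.ArrayList(u32).init(allocator);
        //   defer list.deinit();
        //   try list.append(42);

        // 0.15.x style: the allocator is passed per call instead of stored.
        var list: std.ArrayList(u32) = .empty;
        defer list.deinit(allocator);
        try list.append(allocator, 42);
    }

Mechanical, repetitive changes like that across an entire dependency tree are exactly the kind of thing these models now handle without breaking a sweat.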
On top of that, for any Zig dependency I pull in that was written against an earlier version, I've forked it and Claude has applied these updates correctly, under my guidance, 100% of the time.
On the off chance that guide isn't in its context, it has seen the expected warning or error message, googled it, and made the correct fix, 100% of the time. Which is exactly what a human would do.
Let's play the falsifiability game: Find me a real-world example of an upgrade from one API version to the very next that a modern LLM fails to do correctly. Your choice of beer or coffee awaits if you can provide a link to one.