| ▲ | soulofmischief 2 hours ago | |
2025 has been a wild year for agentic coding models. Cutting-edge models in January 2025 don't hold a candle to cutting-edge models in December 2025. Just the jump from Sonnet 3.5 to 3.7 to 4.5, and Opus 4.5 has been pretty massive in terms of holistic reasoning, deep knowledge as well as better procedural and architectural adherence. GPT-5 Pro convinced me to pay $200/mo for an OpenAI subscription. Regular 5.2 models, and 5.2 codex, are leagues better than GPT-4 when it comes to solving problems procedurally, using tools, and deep discussion of scientific, mathematical, philosophical and engineering problems. Models have increasingly longer context, especially some Google models. OpenAI has released very good image models, and great editing-focused image models in general have been released. Predictably better multimodal inference over the short term is unlocking many cool near-term possibilities. Additionally, we have seen some incredible open source and open weight models released this year, some fully commercially viable without restriction. And more and more smaller TTS/STT projects are in active development, with a few notable releases this year. Honestly, the landscape at the end of the year is impressive. There has been great work all over the place, almost too much to keep up with. I'm very interested in the Genie models and a few others. For an idea: At the beginning of the year, I was mildly successful at getting coding models to make changes in some of my codebases, but the more esoteric problems were out of reach. Progress in general was deliberate and required a lot of manual intervention. By comparison, in the last week I've prototyped six applications at levels that would take me days to weeks individually, often developing multiple at the same time, monitoring agentic workflows and intervening only when necessary, relying on long preproduction phases with architectural discussions and development of documentation, requirements, SDDs... and detailed code review and refactoring processes to ensure adherence to constraints. I'm morphing from a very busy solo developer into a very busy product manager. | ||
| ▲ | orwin 44 minutes ago | parent | next [-] | |
> Just the jump from Sonnet 3.5 to 3.7 to 4.5, and Opus 4.5 has been pretty massive in terms of holistic reasoning, deep knowledge as well as better procedural and architectural adherence. I don't really agree. Aside from how it handled frontend code, changes in Sonnet did not truly impact my overall productivity (from Sonnet 3.7 to 4 to 4.5, I did not try 3.5). Opus 4.5/Codex 5.2 are when the changes truly happened for me (and I'm still a bit distrustful of Codex 5.2, but I use it basically to help me during PRs). | ||
| ▲ | foldr an hour ago | parent | prev [-] | |
> By comparison, in the last week I've prototyped six applications at levels that would take me days to weeks individually [...] I don't doubt that the models have got better, but you can go back two or three years and find people saying the exact same stuff about the latest models back then. | ||