These models are so powerful.

It's totally possible to build entire software products in a fraction of the time it took before.

But, reading the comments here, behavior seems to diverge widely from one point version to the next (not even a major version, mind you).

It feels like we are now able to manage incredibly smart engineers for a month at the price of a good sushi dinner.

But it also feels like you have to be diligent about adopting new models (even within the same family, and even for point-version updates) because they operate totally differently regardless of your prompt and agent files.

Imagine managing a team of software developers where every month it was an entirely new team with radically different personalities, career experiences and guiding principles. It would be chaos.

I suspect that older models will be deprecated quickly and unexpectedly, or, worse yet, will be swapped out for ones with subtly different behavioral characteristics without notice. It'll be quicksand.

simonw 3 hours ago | parent | next [-]

I had an interesting experience recently where I ran Opus 4.6 against a problem that o4-mini had previously convinced me wasn't tractable... and Opus 4.6 found me a great solution. https://github.com/simonw/sqlite-chronicle/issues/20

This inspired me to point the latest models at a bunch of my older projects, resulting in a flurry of fixes and unblocks.

small_model an hour ago | parent | next [-]

I have a codebase (personal project), and every time there is a new Claude Opus model I get it to do a full code review. Never had any breakages in the last couple of model updates. Worried that one day it just generates a binary and deletes all the code.

TZubiri an hour ago | parent [-]

No version control?

small_model 41 minutes ago | parent [-]

I was being facetious. I mean that one day models might skip the middleman of code and compilation, take your specs, and produce an ultra-efficient binary.

jauntywundrkind an hour ago | parent | prev | next [-]

From the project description here for your sqlite-chronicle project:

> Use triggers to track when rows in a SQLite table were updated or deleted

Just a note in case it's interesting to anyone: the SQLite-compatible Turso database has CDC, a changes table! https://turso.tech/blog/introducing-change-data-capture-in-t...
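
For anyone who hasn't seen the trigger approach from that description, here is a rough sketch of the general idea using Python's built-in `sqlite3` module. The `items` table, the `items_changes` log table, and the trigger names are all made up for illustration; this is not the actual sqlite-chronicle schema, and it is not how Turso's CDC works internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);

-- Hypothetical change log, not the actual sqlite-chronicle schema.
CREATE TABLE items_changes (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    item_id INTEGER NOT NULL,
    op TEXT NOT NULL
);

-- Record every update to a row of items.
CREATE TRIGGER items_after_update AFTER UPDATE ON items
BEGIN
    INSERT INTO items_changes (item_id, op) VALUES (OLD.id, 'update');
END;

-- Record every delete of a row of items.
CREATE TRIGGER items_after_delete AFTER DELETE ON items
BEGIN
    INSERT INTO items_changes (item_id, op) VALUES (OLD.id, 'delete');
END;
""")

conn.execute("INSERT INTO items (id, name) VALUES (1, 'a'), (2, 'b')")
conn.execute("UPDATE items SET name = 'a2' WHERE id = 1")
conn.execute("DELETE FROM items WHERE id = 2")

# Read the log back in the order the changes happened.
changes = conn.execute(
    "SELECT item_id, op FROM items_changes ORDER BY seq"
).fetchall()
print(changes)
```

The log table here only stores the row id and the kind of change; a real tool would likely track more (timestamps or version numbers, for example).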

petesergeant 2 hours ago | parent | prev [-]

I continue to get great value out of having claude and codex bound together in a loop: https://github.com/pjlsergeant/moarcode

apitman 2 hours ago | parent [-]

They are one, the ring and the dark lord

jama211 3 hours ago | parent | prev | next [-]

Yeah I keep maintaining a specific app I built with gpt 5.1 codex max with that exact model because it continues to work for the requests I send it, and attempts with other models even 5.2 or 5.3 codex seemed to have odd results. If I were superstitious I would say it’s almost like the model that wrote the code likes to work on the code better. Perhaps there’s something about the structure it created though that it finds easier to understand…

seizethecheese 3 hours ago | parent | prev | next [-]

> It feels like we are now able to manage incredibly smart engineers for a month at the price of a good sushi dinner.

In my experience it’s more like idiot savant engineers. Still remarkable.

WarmWash 3 hours ago | parent | prev | next [-]

I have long suspected that a large part of people's distaste for given models comes from their comfort with their daily driver.

Which I guess feeds back to prompting still being critical for getting the most out of a model (outside of subjective stylistic traits the models have in their outputs).

worldsavior 3 hours ago | parent | prev | next [-]

Sushi dinner? What are you building with AI, a calculator?

HardCodedBias 28 minutes ago | parent | prev [-]

"These models are so powerful."

Careful.

As of 3.0, Gemini simply isn't in the same class for work.

We'll see in a week or two if it really is any good.

Bravo to those who are willing to give up their time to test for Google to see if the model is really there.

(History says it won't be. Ant and OAI really are the only two in this race ATM.)