itay-maman 5 hours ago

Something that caught my eye from the announcement:

> GPT‑5.3‑Codex is our first model that was instrumental in creating itself. The Codex team used early versions to debug its own training

I'm happy to see the Codex team moving to this kind of dogfooding. I think this was critical for Claude Code to achieve its momentum.

codethief 3 hours ago | parent | next [-]

Sounds like the researchers behind https://ai-2027.com/ haven't been too far off so far.

cootsnuck 2 hours ago | parent | next [-]

We'll see. The first two things that they said would move from "emerging tech" to "currently exists" by April 2026 are:

- "Someone you know has an AI boyfriend"

- "Generalist agent AIs that can function as a personal secretary"

I'd be curious how many people know someone that is sincerely in a relationship with an AI.

And also I'd love to know anyone that has honestly replaced their human assistant / secretary with an AI agent. I have an assistant, they're much more valuable beyond rote input-output tasks... Also I encourage my assistant to use LLMs when they can be useful like for supplementing research tasks.

Fundamentally though, I just don't think any AI agents I've seen can legitimately function as a personal secretary.

Also they said by April 2026:

> 22,000 Reliable Agent copies thinking at 13x human speed

And when moving from "Dec 2025" to "Apr 2026" they switch "Unreliable Agent" to "Reliable Agent". So again, we'll see. I'm very doubtful given the whole OpenClaw mess. Nothing about that says "two months away from reliable".

zozbot234 an hour ago | parent | next [-]

> Someone you know has an AI boyfriend

MyBoyfriendIsAI is a thing

> Generalist agent AIs that can function as a personal secretary

Isn't that what MoltBot/OpenClaw is all about?

So far these look like successful predictions.

ainch an hour ago | parent [-]

Moltbot is an attempt to do that. Would you hire it as a personal secretary and entrust all your personal data to it?

danpalmer an hour ago | parent [-]

Only people who haven't had a secretary would think it's a personal secretary.

Like, it can't even answer the phone.

Rudybega 2 hours ago | parent | prev [-]

I think they immediately corrected their median timelines for takeoff to 2028 upon releasing the article (I believe there was a math mistake or something initially), so all those dates can probably be bumped back a few months. Regardless, the trend seems fairly on track.

JackYoustra 20 minutes ago | parent | prev | next [-]

> researchers

that's certainly one way to refer to Scott Alexander

YawningAngel 2 hours ago | parent | prev [-]

I don't think generative AI is even close to making model development 50% faster

aurareturn 5 hours ago | parent | prev [-]

More importantly, these are the early steps of a model improving itself.

Do we still think we'll have soft take off?

mrandish 4 hours ago | parent | next [-]

> Do we still think we'll have soft take off?

There's still no evidence we'll have any take off, at least in the "Foom!" sense of LLMs independently and iteratively improving themselves to substantial new levels, reliably sustained over many generations.

To be clear, I think LLMs are valuable and will continue to significantly improve. But self-sustaining runaway positive feedback loops delivering exponential improvements resulting in leaps of tangible, real-world utility is a substantially different hypothesis. All the impressive and rapid achievements in LLMs to date can still be true while major elements required for Foom-ish exponential take-off are still missing.

rahulyc 3 hours ago | parent [-]

Yes, but also you'll never have any early evidence of the Foom until the Foom itself happens.

janalsncm 2 hours ago | parent [-]

If only General Relativity had such an ironclad defense of being as unfalsifiable as the Foom Hypothesis is. We could’ve avoided all of the quantum physics nonsense.

quinncom 5 hours ago | parent | prev | next [-]

Exponential growth may look like a very slow increase at first, but it's still exponential growth.

janalsncm 3 hours ago | parent | next [-]

Sigmoids may look like exponential growth at first, until they saturate. Early growth alone cannot distinguish between them.
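
To make that concrete, here is a minimal sketch comparing a logistic curve with the exponential that matches its early phase. The rate `k`, midpoint `t0`, ceiling `cap`, and the helper `relative_gap` are illustrative assumptions, not fitted to any real capability data.

```python
import math

# Illustrative parameters only (assumptions, not measurements).
K = 1.0      # growth rate
T0 = 10.0    # midpoint of the logistic curve
CAP = 1.0    # saturation ceiling of the logistic curve


def exponential(t, k=K, t0=T0, cap=CAP):
    """Pure exponential that matches the logistic curve's early phase."""
    return cap * math.exp(k * (t - t0))


def logistic(t, k=K, t0=T0, cap=CAP):
    """Sigmoid (logistic) curve that saturates at `cap`."""
    return cap / (1.0 + math.exp(-k * (t - t0)))


def relative_gap(t):
    """Relative difference between the two curves at time t."""
    e = exponential(t)
    return abs(e - logistic(t)) / e


if __name__ == "__main__":
    for t in (0, 5, 10, 15):
        print(f"t={t:>2}  exp={exponential(t):.6g}  "
              f"logistic={logistic(t):.6g}  gap={relative_gap(t):.4f}")
```

Under these assumptions the two curves differ by well under 1% for t up to 5, by 50% at the midpoint, and by more than 99% at t = 15, so data gathered only from the early stretch fits both about equally well.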

gf000 3 hours ago | parent | prev [-]

If it's exponential growth. It may just as well be some slow growth and continue to be so.

aaaalone 4 hours ago | parent | prev | next [-]

I'm only saying no to keep optimistic tbh

It feels crazy to just say we might see a fundamental shift in 5 years.

But the current addition to compute and research etc. def goes in this direction I think.

thrance 4 hours ago | parent | prev | next [-]

I think the limiting factor is capital, not code. And I doubt GPTX is any more competent at raising funds than the other, fleshy, snake oilers...

8note 4 hours ago | parent | prev | next [-]

making the specifications is still hard, and checking how well results match against specifications is still hard.

i don't think the model will figure that out on its own, because the human in the loop is the verification method for saying if it's doing better or not, and more importantly, for defining better

reducesuffering 5 hours ago | parent | prev [-]

This has already been going on for years. It's just that they were using GPT 4.5 to work on GPT 5. All this announcement means is that they're confident enough in early GPT 5.3 model output to further refine GPT 5.3 based on initial 5.3. But yes, takeoff will still happen because this recursive self-improvement works; it's just that we're already past the inception point.

manmal 2 hours ago | parent | next [-]

I guess humans were involved in all that, so how is that anything but tool use?

mirsadm 5 hours ago | parent | prev [-]

I can't tell if this is a serious conversation anymore.

reducesuffering 3 hours ago | parent [-]

“Best start believing in science fiction stories. You're in one.”

https://x.com/TheZvi/status/2017310187309113781