starchild3001 4 days ago

I’m baffled by claims that AI has “hit a wall.” By every quantitative measure, today’s models are making dramatic leaps compared to those from just a year ago. It’s easy to forget that reasoning models didn’t even exist a year back!

IMO gold, vibe coding with potential implications across science and engineering? These are completely new, transformative capabilities gained in the last year alone.

Critics argue that the era of “bigger is better” is over, but that’s a misreading. Sometimes efficiency is the key, other times extended test-time compute is what drives progress.

No matter how you frame it, the fact is undeniable: the SoTA models today are vastly more capable than those from a year ago, which were themselves leaps ahead of the models a year before that, and the cycle continues.

behnamoh 4 days ago | parent | next [-]

It has become progressively easier to game benchmarks in order to appear higher in rankings. I've seen several models that claimed to be the best at software engineering, only to be disappointed when they couldn't figure out the most basic coding problems. By comparison, I've seen models that don't have much hype but are rock solid.

When people say AI has hit a wall, they are mainly talking about OpenAI losing its hype and its grip on state-of-the-art models.

goatlover 4 days ago | parent | prev | next [-]

Is the stated fact undeniable? Because a lot of people have been contesting it. This reads like PR to counter the widespread GPT-5 criticism and disappointment.

Workaccount2 4 days ago | parent | prev [-]

To be fair, the bulk of GPT-5 complaining comes from a vocal minority pissed that their best friend got swapped out. The other minority is unhinged AI fanatics who thought GPT-5 would be AGI.

oinfoalgo 4 days ago | parent | prev | next [-]

I don't think it is that surprising.

It will become harder and harder for the average person to gain from newer models.

My 75-year-old father loves using Sonnet. He is not asking anything, though, for which he could tell that Opus is "better". The answers he gets from the current model are good enough. He is not exactly using it to probe the depths of statistical mechanics.

My father is never going to vibe code anything no matter how good the models get.

I don't think AGI would even give very different answers to what he asks.

You have to ask the model something that allows the latest model to display its improvements. I think we can see that this is just not something on the mind of the average user.

starchild3001 4 days ago | parent [-]

Correct. People claim these models "saturate", yet what saturates faster is our ability to grasp what these models are capable of.

I, for one, cannot evaluate the difference in strength between an IMO-gold model and an IMO-bronze one.

Soon coding capabilities might also saturate. It might all become a matter of more compute (~ number of iterations) rather than more precision (~ % chance of getting it right the first time), as the models become lightning fast and gain access to a playground.

Workaccount2 4 days ago | parent | prev | next [-]

The prospect of AI not hitting a wall is terrifying to many people for understandable reasons. In situations like this you see the full spectrum of coping mechanisms come to the surface.

jsjdkdlldxlxk 4 days ago | parent | prev [-]

thanks OpenAI, very cool!
