pixel_popping 7 hours ago

Fable 5 will be genuinely weak compared to what's coming. We need to remember this is kinda the beginning still; we will genuinely reach a point where models score 99.9% on every benchmark. Think Opus 10, GPT-10... :)

Also, Fable 5 isn't "that impressive", since a lot of people have had that kind of intelligence for 6+ months by using a combo of models and loops (I scored better on HLE than gpt-5.5 xhigh last January with some good tooling and 6x the cost), but for an average Claude Code user, I can see why it looks that good.

Chance-Device 7 hours ago | parent | next [-]

Are you using it or are you just going off benchmarks?

PestoDiRucola 6 hours ago | parent | prev [-]

What makes you think that models will keep improving at the same pace as they have over the past few years?

petra 5 hours ago | parent [-]

A few reasons:

- 2.5 years is a pretty short time for a new tech development, even if it fails eventually

- usually when a new tech is introduced, the biggest gains happen when the environment is changed to fit it. That takes time: libraries, APIs, verification tooling, RL environments, skilling users, etc.

- possibility of orders-of-magnitude hardware cost reduction: optical, analog, RRAM, many others. Something will work.

And internal improvements in the model architecture. And there's scaling in reasoning time.