verdverm 5 hours ago

Here's a good thread spanning 1+ month, updated as each model comes out

https://bsky.app/profile/pekka.bsky.social/post/3meokmizvt22...

tl;dr - Pekka says ARC-AGI-2 is now toast as a benchmark

Aperocky 4 hours ago | parent [-]

If you look at the problem space, it is easy to see why it's toast: maybe there's intelligence in there, but hardly general.

tasuki an hour ago | parent | next [-]

> maybe there's intelligence in there, but hardly general.

Of course. Just as our human intelligence isn't general.

verdverm 4 hours ago | parent | prev [-]

the best way I've seen this described is "spikey" intelligence: really good at some points, and those make the spikes

humans are the same way, we all have a unique spike pattern, interests and talents

AIs have effectively the same spikes across instances, if simplified. I could argue self driving vs chatbots vs world models vs game playing might constitute enough variation. I would not say the same of Gemini vs Claude vs ... (instances), that's where I see "spikey clones"

Aperocky 4 hours ago | parent [-]

You can get more spiky with AIs, whereas with the human brain we are more hardwired.

So maybe we are forced to be more balanced and general, whereas AIs don't have to be.

verdverm 4 hours ago | parent [-]

I suspect the non-spikey part is the more interesting comparison

Why is it so easy for me to open the car door, get in, close the door, and buckle up? You can do this in the dark and without looking.

There are an infinite number of little things like this that you think zero about and that take near-zero energy, yet are extremely hard for AI

gowld an hour ago | parent [-]

You are asking a robotics question, not an AI question. Robotics is more and less than AI. Boston Dynamics robots are getting quite near your benchmark.