mNovak 3 hours ago
I'm excited for the big jump in ARC-AGI scores from recent models, but no one should think for a second this is some leap in "general intelligence". I joke to myself that the G in ARC-AGI is "graphical". I think what's held back models on ARC-AGI is their terrible spatial reasoning, and I'm guessing that's what the recent models have cracked. Looking forward to ARC-AGI 3, which focuses on trial and error and exploring a set of constraints via games.
causal 3 hours ago
Agreed. I love the elegance of ARC, but it always felt like a gotcha to give spatial reasoning challenges to token generators, and the fact that the token generators are somehow beating it anyway really says something.
throw310822 3 hours ago
The average ARC-AGI 2 score for a single human is around 60%. "100% of tasks have been solved by at least 2 humans (many by more) in under 2 attempts. The average test-taker score was 60%."
colordrops 3 hours ago
Wouldn't you deal with spatial reasoning by giving the model access to a tool that structures the space in a way it can understand, or to a sub-model that does the spatial reasoning itself? These "general" models would serve as the frontal cortex while other models do specialized work. What is missing?