bbarnett 4 days ago

Otherwise, it doesn't measure anything useful.

Oh, it measures a useful metric, absolutely: aspects of an IQ test validate certain types of cognition. Those types of cognition have been found to map to real-world employment of the same.

If an AI cannot perform well on an IQ test for those types of cognition, then one thing we're certainly measuring is that it can't handle that 'class' of cognition once the conditions change in minuscule ways.

And that's quite important.

For example, if the model appears to perform specific work tasks well, related to a class of cognition, but cannot do the same category of cognitive tasks outside of that scope, we're measuring a lack of adaptability or of true cognitive capability.

It's definitely measuring something, such as whether the model goes sideways with small deviations in task or input. That's a nice start.

azernik 2 days ago

"Those types of cognition have been found to map to real-world employment of the same."

...in humans. That correlation has not been established for LLMs.