benreesman 6 days ago

People will go to great lengths to debate the appropriate analogy for how these things work, which is fun, I guess, but in a "get high with a buddy" sense, at least to my taste.

Some of how they work is well understood (a lot now, actually); some of the outcomes are still surprising.

But we debate both the well-understood parts and the surprising parts with the wrong terminology, borrowed from pretty dubious corners of pop cognitive science, rather than with terminology appropriate to the new and different thing! It's nothing like a brain; it's a new, different thing. Does it think or reason? Who knows, pass the blunt.

They do X performance on Y task according to Z eval: that's how you discuss ML model capability if you're pursuing understanding rather than fundraising or clicks.

Vegenoid 6 days ago

While I largely agree with you, more abstract judgements must be made as the capabilities (and therefore tasks being completed) become increasingly general. Attempts to boil human intellectual capability down to "X performance on Y task according to Z eval" can be useful, but are famously incomplete and insufficient on their own for making good decisions about which humans (a.k.a. which general intelligences) are useful and how to utilize and improve them. Boiling down highly complex behavior into a small number of metrics loses a lot of detail.

There is also the desire to discover why a model that outperforms others does so, so that the successful technique can be refined and applied elsewhere. This too usually requires more approaches than metric comparison.