bhouston 3 days ago

> so you instead end up with indirect measurements that assume a Gaussian distribution.

100%. I was going to write something similar.

> If you look at board game Elo ratings (poor test for intelligence but we'll ignore that), they do not follow a Gaussian distribution, even though Elo assumes a Gaussian distribution for game outcomes (but not the population). So that's good evidence that aptitude/skill in intellectual subjects isn't Gaussian (but it's also not Pareto iirc).

Interesting, yeah, Elo is quite interesting. One can view hiring at a company as something like selecting people with an Elo above a certain score, but with some type of error distribution on top of that, probably Gaussian error. So what does a one-sided Elo distribution look like when there is Gaussian error in picking people above that Elo limit?
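A quick way to poke at that question is a simulation. This is a rough sketch only: it assumes the underlying skill population is Gaussian (which the quoted comment disputes for real Elo data), and the mean, spread, cutoff and noise values are made-up numbers, not taken from any real rating pool.

```python
import random
import statistics


def simulate_hires(n=100_000, mean=1500.0, sd=300.0,
                   cutoff=1800.0, noise_sd=150.0, seed=0):
    """Return the true skill of everyone whose *noisy* score clears the cutoff.

    All parameters are hypothetical illustration values.
    """
    rng = random.Random(seed)
    hired = []
    for _ in range(n):
        true_skill = rng.gauss(mean, sd)                    # assumed Gaussian population
        observed = true_skill + rng.gauss(0.0, noise_sd)    # Gaussian measurement error
        if observed >= cutoff:                              # selection uses the noisy score
            hired.append(true_skill)
    return hired


hired = simulate_hires()
share_below = sum(1 for s in hired if s < 1800.0) / len(hired)

print("hired:", len(hired))
print("mean true skill of hires:", round(statistics.mean(hired)))
print("share of hires whose true skill is below the cutoff:", round(share_below, 3))
```

With noisy selection the result is not a hard one-sided truncation: the left edge is smeared out, so some of the people selected sit below the nominal limit.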

KK7NIL 3 days ago | parent [-]

Lichess has public population data (they use a modified version of Glicko-2 which is basically an updated version of Elo's system): https://lichess.org/stat/rating/distribution/blitz

It's basically a Gaussian with a very long right tail.

Big caveat here is that these are the ratings of weekly active players. If we instead include casual players, I suspect we'd have something resembling a Pareto distribution.

doctorpangloss 3 days ago | parent | next [-]

The big caveat is that it's trivial to measure the AIC, BIC and other goodness-of-fit measures for a distribution. If you think it's such-and-such distribution, go for it. In my experience, in this specific case of chess rankings and in the broader case of test scores, skew-normal and log-normal have worse fits than plain Gaussian.

I have no idea why you would believe increasing the population would make this Gaussian distribution look Pareto, when the exact opposite is true - increasing populations make things look more Gaussian - in all natural circumstances.
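For anyone who wants to try the AIC/BIC comparison mentioned above, here is a minimal sketch using only the standard library. It compares a plain Gaussian against a log-normal, both fitted by maximum likelihood. The data is synthetic (drawn from a Gaussian with made-up parameters), so it says nothing about real chess ratings; swap in an actual rating sample to test the claim.

```python
import math
import random


def normal_loglik(xs):
    """Maximized Gaussian log-likelihood (MLE mean and variance)."""
    n = len(xs)
    mu = sum(xs) / n
    var = sum((x - mu) ** 2 for x in xs) / n
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


def lognormal_loglik(xs):
    """Maximized log-normal log-likelihood; requires all xs > 0."""
    logs = [math.log(x) for x in xs]
    # Log-normal density is the normal density of log(x) divided by x.
    return normal_loglik(logs) - sum(logs)


def aic(loglik, k):
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    return k * math.log(n) - 2 * loglik


# Synthetic stand-in for a rating sample (hypothetical parameters).
rng = random.Random(1)
ratings = [x for x in (rng.gauss(1500.0, 300.0) for _ in range(5000)) if x > 0]

n = len(ratings)
fits = {
    "normal": normal_loglik(ratings),
    "log-normal": lognormal_loglik(ratings),
}
for name, ll in fits.items():
    # Both models have two fitted parameters, so k = 2.
    print(name, "AIC:", round(aic(ll, 2), 1), "BIC:", round(bic(ll, 2, n), 1))
```

Lower AIC/BIC means a better fit after penalizing parameter count. Since both candidates here have two parameters, the comparison reduces to the log-likelihoods.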

KK7NIL 3 days ago | parent [-]

I was conjecturing that the distribution would be closer to Pareto for everyone (including people who've never learned how to play chess), hence why I said that "active players" is a big caveat.

> increasing populations make things look more Gaussian - in all natural circumstances.

This is just not the case, there's plenty of "natural circumstances" where populations have non-Gaussian distributions.

Perhaps you meant a specific type of population, like chess ratings? I'd be interested in seeing what you find there, but all I've found shows significantly distorted tails (not to mention a skew from 1500).

JackFr 3 days ago | parent | prev [-]

Good question - do the bad players play less because they are bad, or are they bad because they play less?

bhouston 3 days ago | parent [-]

> Good question - do the bad players play less because they are bad, or are they bad because they play less?

Both for sure. If you don't practice you will never rise much above bad. But if you are bad and not progressing, you won't play much because it isn't rewarding to lose.

One would almost need to take the players with low Elo ratings, compare their rating history against the number of games they have played, and see whether they were following an expected Elo progression.
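As a reference point for what a "progression" is built from, this is the standard Elo expected-score and update rule. It is a sketch for illustration: the K-factor of 32 is just a common choice, and sites like Lichess use Glicko-2 rather than plain Elo, so real histories will not follow this exactly.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_rating(rating_a, rating_b, score_a, k=32.0):
    """New rating for A after one game; score_a is 1 (win), 0.5 (draw) or 0 (loss)."""
    return rating_a + k * (score_a - expected_score(rating_a, rating_b))


# Replay a hypothetical sequence of (opponent rating, result) pairs.
rating = 1500.0
history = [rating]
for opponent, result in [(1500, 1), (1550, 0), (1480, 0.5), (1600, 1)]:
    rating = update_rating(rating, opponent, result)
    history.append(rating)

print([round(r, 1) for r in history])
```

Comparing a player's actual rating history against games played, as suggested above, would start from a per-game trace like `history`.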

I wonder if you can estimate with any accuracy where a player will eventually plateau given just a smallish sample of their first games. Basically, estimate the trajectory based on how they start and progress. This would be interesting. Given how well studied chess is, I expect this has already been done to some extent somewhere.