empath75 | 4 days ago:
Test scores do not directly measure what they purport to measure. They are a proxy, and when you have a system designed to optimize for a proxy of the thing you want to improve, the system will always find ways to exploit the difference between the proxy and the underlying thing. You can call it "juking the stats", "overfitting", or whatever. https://sohl-dickstein.github.io/2022/11/06/strong-Goodhart....
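A toy sketch of that proxy-gap dynamic (hypothetical numbers, just to make the mechanism concrete): model true ability plus non-transferable test prep, observe only their sum, and select hard on the observed score.

```python
import numpy as np

# Toy model with hypothetical numbers: true ability plus non-transferable
# test prep, where only their sum (the score) is observable.
rng = np.random.default_rng(0)
n = 100_000
ability = rng.normal(0, 1, n)   # the thing we actually want to improve
prep    = rng.normal(0, 1, n)   # test-specific gaming that doesn't transfer
score   = ability + prep        # the proxy the system sees and optimizes

# "Optimize the proxy": select the top 1% of observed scores.
selected = score > np.quantile(score, 0.99)

print("mean ability, everyone:    %.2f" % ability.mean())
print("mean ability, top scorers: %.2f" % ability[selected].mean())
print("mean prep,    top scorers: %.2f" % prep[selected].mean())
# The selected group really is above average, but roughly half of what the
# proxy "sees" in them is prep/noise rather than ability, and the absolute
# size of that gap grows as you select harder on the score.
```

In this setup the split is just regression toward the mean: the more pressure you put on the proxy, the more of what you reward is the gap rather than the target.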
programjames | 4 days ago (reply):
We can only ever have proxies. The issue (at least here) isn't Goodharting; it's that the proxy does not even measure what the researchers claim it measures. For example, pretty much the only study of its kind, "High-Achieving Students in the Era of NCLB" (Loveless, Farkas, Duffett), uses the NAEP, which only releases 10th, 20th, ..., 90th percentile scores. Gifted programs are ~95th percentile.

Or, look at this lovely graph of the percentage of ACT composite scores that are a perfect 36: https://en.m.wikipedia.org/wiki/File:Percent_ACT_Composite_S...

When it increases by 30x in 30 years, it becomes very apparent that standardized tests are simultaneously being gamed by the students and the test writers. Students (and school districts and states) are choosing the "easier" tests that make them look proficient, so test writers are making their tests easier to win a bigger market share. And all along, the tests that only care about separating out the top show declining scores every year...

- [AMC Historical Results](https://artofproblemsolving.com/wiki/index.php/AMC_historica...)
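To make the NAEP complaint concrete, here is a small hypothetical sketch (made-up distributions, not NAEP data): two cohorts whose released 10th through 90th percentile cuts are identical, but whose 95th percentiles, where gifted programs sit, clearly differ. Decile-only releases cannot distinguish them.

```python
import numpy as np

# Two hypothetical cohorts: identical at every released decile,
# different at the 95th percentile.
rng = np.random.default_rng(1)
cohort_a = rng.normal(500, 100, 1_000_000)

# Cohort B: unchanged below the 95th percentile, but the top 5% of scores
# is compressed toward the 90th percentile (high achievers losing ground).
p90, p95 = np.quantile(cohort_a, [0.90, 0.95])
cohort_b = np.where(cohort_a > p95, p90 + 0.3 * (cohort_a - p90), cohort_a)

deciles = np.arange(0.1, 1.0, 0.1)
print(np.round(np.quantile(cohort_a, deciles)))   # nine released cuts: identical
print(np.round(np.quantile(cohort_b, deciles)))
print(round(np.quantile(cohort_a, 0.95)),         # 95th percentile: clearly lower in B
      round(np.quantile(cohort_b, 0.95)))
```

So any conclusion about the top few percent drawn from those nine released cuts is extrapolation, which is the sense in which the proxy isn't measuring what the study claims.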