programjames 5 days ago

The core problem is actually very simple: education studies do not measure what they claim to measure. When they say, "education outcomes improve when..." they usually mean the pass rate, i.e. they only measured a signal among the bottom 20% of students. When they say, "test scores improve when..." they are, at best, measuring up to the 90th percentile. When they say, "the white/black attainment gap" or "socioeconomic disadvantages," they're usually just fishing for funding money, and their study will not actually attempt to measure either of those things. From a 2015 review of the literature on No Child Left Behind (NCLB):

> Only one study specifically examined the achievement gap for students from low socioeconomic backgrounds (Hampton & Gruenert, 2008) despite NCLB's stated commitment to improving education for children from low-income families. African American students were often mentioned in studies of general student achievement but none of the reviewed studies focused specifically on the effects of NCLB for this subgroup. Again, this is a curious gap in the research considering the law's emphasis on narrowing the Black-White achievement gap. Other groups of students underrepresented in the research on NCLB include gifted students, students with vision impairments, and English proficient minority students.

("A Review of the Empirical Literature on No Child Left Behind From 2001 to 2010", Husband & Hunt, 2015)
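The pass-rate point is easy to demonstrate with a toy simulation (made-up numbers, just to illustrate the censoring): a pass-rate metric registers zero signal from gains above the threshold.

```python
import random

random.seed(0)

# Hypothetical score distribution; the pass threshold is what gets reported.
scores = [random.gauss(70, 15) for _ in range(10_000)]
PASS = 60

def pass_rate(xs):
    return sum(x >= PASS for x in xs) / len(xs)

baseline = pass_rate(scores)

# Intervention A: every student above 70 gains 20 points.
top_boost = [x + 20 if x > 70 else x for x in scores]

# Intervention B: every student below the threshold gains 5 points.
bottom_boost = [x + 5 if x < PASS else x for x in scores]

print(pass_rate(top_boost) == baseline)    # top-end gains are invisible
print(pass_rate(bottom_boost) > baseline)  # only bottom-end gains register
```

Intervention A only moves students who already pass, so the metric cannot see it; a study reporting "education outcomes" via pass rates is, structurally, a study of the weakest students.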

Everything you see going wrong is downstream of this. Yes, harmful ideologies have done a lot of damage to the education system, but it could easily survive this if we had actual signifiers of success.

empath75 4 days ago | parent [-]

Test scores do not measure directly what they purport to measure. They are a proxy, and when you have a system designed to optimize for a proxy of the thing you want to improve, then the system will always find ways to exploit the difference between the proxy and the underlying thing.
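A minimal sketch of that proxy gap (my own toy model, not from the linked post): suppose the true outcome is learning, but the measured proxy is a test score that can also be raised by pure test prep, which crowds out learning time. Optimizing the proxy then drives the true outcome to zero.

```python
def outcomes(hours_learning, hours_test_prep):
    learning = hours_learning                      # what we actually want
    score = hours_learning + 2 * hours_test_prep   # what we measure
    return learning, score

BUDGET = 10  # total hours available per week (arbitrary)

# Choose the time split that maximizes the proxy (the test score).
best_split = max(range(BUDGET + 1),
                 key=lambda h: outcomes(h, BUDGET - h)[1])

learning, score = outcomes(best_split, BUDGET - best_split)
print(best_split, learning, score)  # 0 hours of learning, maximal score
```

The proxy-optimal policy spends every hour on test prep: the score is maximized while actual learning collapses, which is the exploitation of the proxy/target gap in miniature.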

You can call it "juking the stats" or "overfitting", or whatever.

https://sohl-dickstein.github.io/2022/11/06/strong-Goodhart....

programjames 4 days ago | parent [-]

We can only ever have proxies. The issue (at least here) isn't Goodharting, it's that the proxy does not even measure what the researchers claim it measures. For example, pretty much the only study on the topic, "High-Achieving Students in the Era of NCLB" (Loveless, Farkas, & Duffett, 2008), uses the NAEP, which only releases the 10th, 20th, ..., 90th percentile scores. Gifted programs sit at roughly the 95th percentile. Or, look at this lovely graph of the share of perfect 36 ACT composite scores:

https://en.m.wikipedia.org/wiki/File:Percent_ACT_Composite_S...

When the share of perfect scores increases ~30x in 30 years, it becomes very apparent that standardized tests are being gamed by students and test writers simultaneously. Students (and school districts and states) are choosing the "easier" tests that make them look proficient, so test writers are making their tests easier to win market share. And all along, the tests that only care about separating out the top show declining scores every year...

- [AMC Historical Results](https://artofproblemsolving.com/wiki/index.php/AMC_historica...)
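The decile-censoring problem with the NAEP releases mentioned above can be sketched directly (stand-in numbers, not real NAEP data): a large gain confined to the top ~5% of students leaves every released percentile untouched.

```python
import statistics

def deciles(xs):
    # 10th, 20th, ..., 90th percentiles -- all that gets released.
    return [round(q, 1) for q in statistics.quantiles(xs, n=10)]

scores = list(range(1, 1001))                         # stand-in distribution
boosted = [x + 50 if x > 950 else x for x in scores]  # top 5% gain 50 points

print(deciles(scores) == deciles(boosted))  # True: the deciles can't see it
```

Any change (up or down) confined to the gifted-program range is invisible in the published numbers, so a study of high achievers built on those tables cannot actually measure its stated subject.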