adamzwasserman 17 hours ago

This is the strongest point in the thread. The article treats poverty, climate, and markets as though the obstacle were insufficient model capacity. But these systems contain agents with values and motivations who actively resist interventions. A billion-parameter model of a system whose components are trying to game the model will never be a theory of that system. The agents will simply route around it.
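
A toy sketch of that dynamic, in Python. Everything here is invented for illustration (the latent "quality" variable, the selection threshold, the gaming rule); the only point is that a model that was accurate on honest behavior degrades the moment the agents condition on it:

    import numpy as np

    rng = np.random.default_rng(0)

    # Agents have a latent quality q; the observable score starts
    # out as an honest proxy for it.
    q = rng.normal(0, 1, 10_000)
    score = q + rng.normal(0, 0.3, 10_000)

    # The "model": select agents whose score clears a threshold.
    threshold = np.quantile(score, 0.9)
    before = (q[score > threshold] > 1.0).mean()

    # Agents respond: anyone within reach inflates their score just
    # past the threshold, at a cost unrelated to q.
    gamed = np.where(score > threshold - 0.5,
                     np.maximum(score, threshold + 0.01), score)
    after = (q[gamed > threshold] > 1.0).mean()

    print(f"selection precision, honest agents: {before:.2f}")
    print(f"selection precision, gaming agents: {after:.2f}")

The model's parameters never changed; the system it was a model of did.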

More broadly, the article assumes that scaling model capacity will eventually bridge the gap between prediction and understanding. I have pre-registered experiments on OSF.io that falsify the strong scaling hypothesis for LLMs: past a certain point, additional parameters buy you better interpolation within the training distribution without improving generalization to novel structure. This shouldn't surprise anyone. If the entire body of science has taught us anything at all, it is that regularity is only ever achieved at the price of generality. A model that fits everything predicts nothing.
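
The interpolation-vs-generalization point fits in a few lines. This is a cartoon with polynomial degree standing in for parameter count, not the OSF protocol: extra capacity keeps improving fit inside the training support, while extrapolation to structure the training data never showed gets worse, not better.

    import numpy as np

    rng = np.random.default_rng(1)

    # Ground truth with structure the training range only partly reveals.
    def f(x): return np.sin(2 * np.pi * x)

    x_train = rng.uniform(0.0, 1.0, 40)
    y_train = f(x_train) + rng.normal(0, 0.05, 40)

    x_in = np.linspace(0.0, 1.0, 200)    # inside the training distribution
    x_out = np.linspace(1.0, 1.5, 200)   # novel structure: outside it

    for degree in (3, 7, 11, 15):
        coeffs = np.polyfit(x_train, y_train, degree)
        mse_in = np.mean((np.polyval(coeffs, x_in) - f(x_in)) ** 2)
        mse_out = np.mean((np.polyval(coeffs, x_out) - f(x_out)) ** 2)
        print(f"degree {degree:2d}: in-dist MSE {mse_in:.4f}, "
              f"OOD MSE {mse_out:.2e}")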

The author gestures at mechanistic interpretability as the path from oracle to science. But interpretability research keeps finding that what these models learn are statistical regularities in training data, not causal structure. Exactly what you'd expect from a compression algorithm. The conflation of compression with explanation is doing a lot of quiet work in this essay.
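
That conflation is easy to demonstrate. In this toy (the confounder z and the linear setup are mine, not anything from the interpretability literature), regressing y on x compresses the observational data essentially perfectly, and the learned regularity still collapses the moment you intervene on x:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 100_000

    # Observational world: a hidden confounder z drives both x and y.
    # x has no causal effect on y at all.
    z = rng.normal(0, 1, n)
    x = z + rng.normal(0, 0.1, n)
    y = 2 * z + rng.normal(0, 0.1, n)

    # "Compression": regress y on x. It nails the correlation.
    slope = np.cov(x, y)[0, 1] / np.var(x)
    print(f"learned slope: {slope:.2f}")  # ~1.98, near-perfect fit

    # Intervention: set x by fiat, do(x), severing the link to z.
    x_do = rng.normal(0, 1, n)
    y_do = 2 * z + rng.normal(0, 0.1, n)  # y is unmoved by x_do
    mse = np.mean((slope * x_do - y_do) ** 2)
    print(f"MSE under intervention: {mse:.2f}")  # large, ~8

The regularity was real; the causal story it implied was wrong. Prediction within the data-generating regime, zero understanding of what happens when the regime changes.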