rachofsunshine 2 days ago

What makes Goodhart's Law so interesting is that you transition smoothly between two entirely different problems the more strongly people want to optimize for your metric.

One is a measurement problem, a statement about the world as it is: an engineer who can finish such-and-such many steps of this coding task in such-and-such time has such-and-such chance of getting hired. The thing you're measuring isn't running away from you or trying to hide itself, because facts aren't conscious agents with the goal of misleading you. Measurement problems are problems of statistics and optimization, and their goal is a function f: states -> predictions. Your problems are usually problems of inputs, not problems of mathematics.
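A minimal sketch of that measurement framing (assuming scikit-learn; the feature names, data, and hiring rule below are entirely made up): the world hands you fixed observations, and fitting f is ordinary statistics.

```python
# Hypothetical measurement problem: predict hiring odds from static
# observations. The data does not react to the model; this is plain statistics.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
steps_done = rng.integers(0, 10, n)        # coding-task steps finished
minutes = rng.uniform(10, 60, n)           # time taken
hired = (steps_done / minutes + rng.normal(0, 0.05, n) > 0.15).astype(int)

X = np.column_stack([steps_done, minutes])
f = LogisticRegression().fit(X, hired)     # f: states -> predictions
print(f.predict_proba([[7, 30.0]])[:, 1])  # P(hired | 7 steps in 30 min)
```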

But the larger you get, and the more valuable gaming your test becomes, the more you leave that measurement problem behind and find yourself with an adversarial problem. Adversarial problems are at least as difficult as your adversary is intelligent, and they can be even worse when your adversary is the invisible hand of the market itself. You don't live in the world of gradient descent anymore, because the landscape is no longer fixed. You now live in the world of game theory, and your goal is a function f: (state) x (time) x (adversarial capability) x (history of your function f) -> predictions.
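A toy sketch of the adversarial framing (every quantity and coefficient below is invented for illustration): once the metric's own history feeds back into next round's behavior, a score fit to last round's world starts to drift away from the thing it was supposed to measure.

```python
# Each round, test-takers see how the last round was scored and shift effort
# toward gaming the metric, so yesterday's f no longer describes today's state.
import numpy as np

rng = np.random.default_rng(1)
true_skill = rng.uniform(0, 1, 200)
gaming = np.zeros(200)                      # accumulated test-gaming ability
threshold = 0.6                             # published passing score

for rnd in range(5):
    score = 0.7 * true_skill + 0.3 * gaming      # what the benchmark sees
    passed = score > threshold
    # adversarial step: the metric's history changes next round's inputs
    gaming += 0.2 * (threshold - score).clip(min=0)
    print(f"round {rnd}: pass rate {passed.mean():.2f}, "
          f"corr(score, skill) {np.corrcoef(score, true_skill)[0, 1]:.2f}")
```

Run it and the pass rate climbs while the score's correlation with skill erodes, which is Goodhart's Law in miniature.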

It's that last, recursive bit that really makes adversarial problems brutal. Very simple functions can develop extremely deep chaotic dynamics once you allow even the slightest bit of recursion - even a function as nice as the logistic map f(x) = 4x(1-x) becomes a writhing ergodic mass of confusion when you iterate it.
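A short illustration of that sensitivity: iterating the logistic map at r = 4 from two starting points that differ by one part in a billion, the trajectories lose all resemblance within a few dozen steps.

```python
# Logistic map in the chaotic regime: tiny initial differences blow up fast.
def f(x, r=4.0):
    return r * x * (1 - x)

a, b = 0.2, 0.2 + 1e-9
for i in range(60):
    a, b = f(a), f(b)
    if i % 10 == 9:
        print(f"step {i+1}: {a:.6f} vs {b:.6f}  (gap {abs(a - b):.2e})")
```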

pixl97 a day ago

I would also assume Russell's paradox needs to be added in here too. Humans can and do hold sets of conflicting information; my theory is that those conflicts carry an informational/processing cost to manage. In benchmark gaming you can optimize for processing speed by removing the conflicting information, but you lose on real-world reliability.

visarga a day ago

Well said. The problem with recursion is that it constructs its own context as it goes and rewrites its own rules; you cannot predict it statically, without forward execution. That's why we have the halting problem: recursion is irreducible. A benchmark is a static dataset; it does not capture the self-constructing nature of recursion.

bwfan123 a day ago

Nice comment. This is one reason ML approaches may struggle in trading markets, where other agents are competing with you, possibly using similar algos, or in self-driving, which involves other agents who could be adversarial. Just training on past data is not sufficient: existing edges get competed away and new edges keep arising out of nowhere.