bugglebeetle 6 hours ago

Yeah, it’s hilarious to be having this conversation about MLEs while attributing the bad outcomes to anything other than poorly designed reward functions, i.e. management. If an engineer burned millions on failed training runs because they did a shit job of creating a policy that maximized for the desired outcome, they’d get canned, but that’s just a Tuesday for your average MBA with VC backing.