daemonologist 7 hours ago

I theorize that this kind of thing - verification approaches that already exist but that a human would not usually reach for - is a result of RLVR (reinforcement learning with verifiable rewards). A couple of times I've noticed Claude Opus trying to use `nm` to check that its changes made it into the binary (and often tying itself in knots in the process), so I've stuffed some extra coaxing along the lines of "don't try to be clever" into its prompt.