▲ | lieret 5 days ago | |
[Also on the SWE-bench team] Part of the reason why this didn't surface earlier was that it only seems to affect more recent models, maybe the result of reward hacking during posttraining. We're currently working on making trajectories easier to access for everyone through a web tool (rather than having to download things from aws) to get even more eyes on the trajectories. The interface will also include search & LM inspection tools to specifically look for anything that might qualify as cheating. |