▲ | blndrt 5 days ago | |
I only had Claude rewrite the domain policies and generic instructions, not the individual task statements. I updated the blog with a link showing the exact changes. So no leakage — it wasn’t solving or hinting at any of the specific test cases, since none of the tasks were ever exposed to it. |