ants_everywhere 3 days ago
Okay, thanks, I'll try that.

> have run into Claude modifying problem statements, adding axioms, etc.

Same here. I've thought about creating a utility that tells Claude it has to keep going until a test exits with zero status (i.e., passes). But I'm concerned Claude would just fake everything to make the test pass.
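For what it's worth, here is a rough sketch of what such a utility might look like. It assumes the usual convention that a test command exits with status 0 on success, and it tries to address the faking concern by hashing a set of protected files (say, the problem statement and the test itself) and rejecting any round that changes them. The names `run_until_pass`, `protected`, and `agent_step` are hypothetical; `agent_step` stands in for whatever actually asks the model for another attempt, and none of this is an existing tool.

```python
import hashlib
import subprocess
from pathlib import Path


def digest(paths):
    """Return one SHA-256 hex digest over the contents of the given files."""
    h = hashlib.sha256()
    for p in sorted(str(p) for p in paths):
        h.update(Path(p).read_bytes())
    return h.hexdigest()


def run_until_pass(test_cmd, protected, agent_step, max_rounds=5):
    """Retry until test_cmd exits 0, refusing rounds that edit protected files.

    agent_step is a hypothetical callback that lets the model make one attempt.
    """
    baseline = digest(protected)
    for round_no in range(1, max_rounds + 1):
        agent_step(round_no)
        # Check for tampering before trusting the test result.
        if digest(protected) != baseline:
            return f"rejected: protected files changed in round {round_no}"
        if subprocess.run(test_cmd).returncode == 0:
            return f"passed in round {round_no}"
    return "gave up"
```

This only catches edits to the files you list. It would not stop the model from satisfying the test in some other degenerate way, so the protected set has to cover everything that defines the problem.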