Actually no, we give it up to 3 attempts. In fact, Opus 4 failed 36 of the 50 tests on its first attempt, but it was REALLY good at nailing the second attempt after receiving error feedback.
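
For anyone picturing the setup, a retry-with-error-feedback loop could look roughly like the sketch below. This is only an illustration, not the actual harness: `generate`, `run_test`, and the toy stand-ins are hypothetical names made up for the example, and the only detail taken from the message above is the cap of 3 attempts.

```python
from typing import Callable, Optional, Tuple


def solve_with_retries(
    generate: Callable[[str, Optional[str]], str],  # hypothetical: (task, last_error) -> candidate
    run_test: Callable[[str], Optional[str]],       # hypothetical: candidate -> error text, or None on pass
    task: str,
    max_attempts: int = 3,
) -> Tuple[bool, int]:
    """Try a task up to max_attempts times, feeding each error back in.

    Returns (passed, attempts_used).
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        candidate = generate(task, feedback)
        error = run_test(candidate)
        if error is None:
            return True, attempt
        # Pass the failure message to the next attempt.
        feedback = error
    return False, max_attempts


# Toy stand-ins: the first try fails, any try that sees feedback succeeds.
def fake_generate(task: str, feedback: Optional[str]) -> str:
    return "good" if feedback else "bad"


def fake_run_test(candidate: str) -> Optional[str]:
    return None if candidate == "good" else "AssertionError: expected good"


print(solve_with_retries(fake_generate, fake_run_test, "demo"))  # (True, 2)
```

Tracking `attempts_used` per task is what lets you separate first-attempt failures from tasks that pass once the error message is fed back.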