▲ | mritchie712 a day ago | |
looks like this is one-shot generation right? I wonder how much the results would change with a more agentic flow (e.g. allow it to see an error or select * from the_table first). sonnet seems particularly good at in-session learning (e.g. correcting it's own mistakes based on a linter). | ||
▲ | _peregrine_ a day ago | parent [-] | |
Actually no, we have it up to 3 attempts. In fact, Opus 4 failed on 36/50 tests on the first attempt, but it was REALLY good at nailing the second attempt after receiving error feedback. |