How do you define "correct" code?
Code that gets stuff done instead of beating around the bush and throwing unexpected errors
i suspect this is highly dependent on what you're working on
from my experience if you give the models a way to self-verify correctness they succeed basically 100% of the time
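rough sketch of what "a way to self-verify" could look like, assuming the check is a handful of input/output tests. everything here is hypothetical: `fake_model` is a stand-in for a real model call, and `verify` / `solve_with_verification` are names made up for illustration, not any library's API

```python
from typing import Callable, Optional

Tests = list[tuple[tuple, object]]  # (arguments, expected result) pairs


def verify(source: str, tests: Tests, func_name: str) -> Optional[str]:
    """Return None if every test passes, else a short failure message."""
    namespace: dict = {}
    try:
        # NOTE: exec runs arbitrary code; a real setup would sandbox this.
        exec(source, namespace)
        func = namespace[func_name]
        for args, expected in tests:
            got = func(*args)
            if got != expected:
                return f"{func_name}{args} returned {got!r}, expected {expected!r}"
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def solve_with_verification(
    generate: Callable[[Optional[str]], str],
    tests: Tests,
    func_name: str,
    max_attempts: int = 3,
) -> tuple[Optional[str], int]:
    """Ask for code, check it, and feed the failure back until it passes."""
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        source = generate(feedback)
        feedback = verify(source, tests, func_name)
        if feedback is None:
            return source, attempt
    return None, max_attempts


def fake_model(feedback: Optional[str]) -> str:
    """Hypothetical stand-in for a model: wrong first, fixed after feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


TESTS: Tests = [((1, 2), 3), ((0, 0), 0)]
result_source, attempts_used = solve_with_verification(fake_model, TESTS, "add")
print(attempts_used)
```

the point is just that the failure message goes back into the next attempt, so the model is checking against something concrete instead of guessing whether it's done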