▲ | Voloskaya 2 days ago | |
I don't necessarily disagree, my point is more that today you can realistically let an agent do several steps and use several tools, following a plan of it's own, before doing a manual review (e.g. Claude Code followed by a PR review). After all an intern has agency, even if I'm going to double check everything they do. GPT-3, while being impressive at the time, was too bad to even let it do that, it would break after 1 or 2 steps, so letting it do anything by itself would have been a waste of time where the human in the loop would always have to re-do everything. It's planning ability was too bad and hallucinations way to frequent to be useful in those scenarios. |