noosphr 2 days ago

All LLMs still suck too much to trust them with basic tasks without a human in the loop. The only people who don't realize this are the ones whose paycheck depends on them not understanding it.

Voloskaya 2 days ago

I don't necessarily disagree. My point is more that today you can realistically let an agent do several steps and use several tools, following a plan of its own, before doing a manual review (e.g. Claude Code followed by a PR review). After all, an intern has agency, even if I'm going to double-check everything they do.

GPT-3, while impressive at the time, was too unreliable for even that: it would break after 1 or 2 steps, so letting it do anything by itself would have been a waste of time, since the human in the loop would always have to redo everything. Its planning ability was too poor and its hallucinations way too frequent to be useful in those scenarios.