pablobaz 3 hours ago

which bits of this do you think llm based agents can't do?

mbesto an hour ago | parent | next [-]

LLMs by their nature are not goal-oriented (this is a fundamental difference between reinforcement learning and neural networks, for example). So a human will have, let's say, the ultimate goal of creating value with the web application they create ("save me time!"). The LLM has no concept of that. It's trying to complete a spec as best it can with no knowledge of the goal. Even if you tell it the goal, it has no concept of the process needed to achieve the goal or to confirm it was attained - you have to tell it that.

interstice 2 hours ago | parent | prev | next [-]

Not get stuck on an incorrect train of thought, not ignore core instructions in favour of training data like breaking naming conventions across sessions or long contexts, not confidently state "I completely understand the problem and this will definitely work this time" for the 5th time without actually checking. I could go on.

ModernMech 2 hours ago | parent | prev [-]

The main thing they cannot do is be held accountable for any decisions, which makes them not trustworthy.

vbezhenar 2 hours ago | parent [-]

This is not correct. They can say "sorry", which makes them as accountable as an ordinary developer.

interstice 2 hours ago | parent | next [-]

I've found recent versions of Claude and Codex to be reluctant in this regard. They will recognise the problem they created a few minutes ago, but often behave as if someone else did it. In many ways that's true, though, I suppose.

bee_rider 44 minutes ago | parent [-]

Does it do this for really cut-and-dried problems? I’ve noticed that ChatGPT will put a lot of effort into (retroactively) “discovering” a basically-valid alternative interpretation of something it said previously, if you object on good grounds. It’s like it’s trying to evade admitting that it made a mistake, but also to find some way to satisfy your objection. Fair enough, if slightly annoying.

But I have also caught it on straightforward matters of fact, and it’ll apologize. Sometimes in an over-the-top fashion…

bluefirebrand 2 hours ago | parent | prev [-]

That's not what accountability is