ehnto 10 hours ago

They still fail in the real world, where a single failure can be highly consequential. AI coding is lucky that its failures show up early and are pretty low consequence. But I don't know what that looks like for an autonomous management agent with arbitrary metrics as goals.

Anyone doing AI coding can tell you that once an agent gets on the wrong path, it can get very confused, and the session is usually unrecoverable. What does that look like in other contexts? Is restarting the process from scratch even possible in other types of work, or is that an option in only some kinds of work?