Twey 5 hours ago
I think that's exactly an example of climbing the abstraction ladder. An agent that's incapable of reframing the current context, given a bad task, will try its best to complete it. An agent capable of generalizing to an overarching goal can figure out when the current objective is at odds with the more important goal. You're correct that these aren't really ‘coding agents’ any more, though. Any more than software developers are!
daxfohl an hour ago
Not just the abstraction ladder, though. Also the situational-awareness ladder, the functionality ladder, and most importantly the trust ladder. I can mostly trust the thing to make code changes because the task is fairly well defined, and there are compile errors, unit tests, code reviews, and other gating factors to catch mistakes. As you move up the abstraction ladder, though, how do I know that this thing is actually making sound decisions versus spitting out well-formatted AIorrhea?

At the very least, these agents need additional functionality: to sit in on and contribute to meetings, write up docs and comment threads, ping relevant people on chat when something changes, and set up meetings to resolve conflicts or uncertainties. They also need to understand their role; the people they work with and their roles, levels, and idiosyncrasies; the relative importance and idiosyncrasies of different partners; the exceptions to supposed invariants, why they exist, what they imply, and when they shouldn't be used; and when to escalate vs. when to decide vs. when to defer vs. when to chew on something for a few days while doing other things.

For example, say you have an authz system and three partners requesting three different features, the combination of which would create an easily identifiable and easily attackable authz back door. Unless you specifically ask the AI to look for this, it'll happily implement all three features and sink your company. You can't fault it: it did everything you asked. You just trusted it with an implicit requirement that it didn't meet. It wasn't "situationally aware" enough to read between the lines.

What you really want is something that would preemptively identify the conflict, schedule meetings with the different parties, get a better understanding of what each request is trying to unblock, and ideally distill everything down into a single feature that unblocks them all.

You can't just move up the abstraction ladder without moving up all those other ladders as well. Maybe that's possible someday, but right now these are still just okay coders with no understanding of anything beyond the task you just gave them. That's fine for single-person hobby projects, but it'll be a while before we see them replacing engineers in the business world.
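A contrived sketch of that back-door scenario (the roles, function names, and policy rules are made up for illustration and not from any real system): each rule below is plausible when requested on its own, but chained together they let any API client take over any account.

    # Hypothetical illustration only: three "reasonable" authz features that
    # compose into an account-takeover chain.
    from dataclasses import dataclass, field

    @dataclass
    class User:
        name: str
        roles: set = field(default_factory=set)

    # Feature 1 (partner A): API clients and team admins may grant team_admin,
    # with no approval step, "to unblock automation".
    def grant_team_admin(actor: User, target: User) -> None:
        if actor.roles & {"api_client", "team_admin"}:
            target.roles.add("team_admin")
        else:
            raise PermissionError("not allowed to grant team_admin")

    # Feature 2 (partner B): team admins may mint service accounts for their team.
    def create_service_account(actor: User) -> User:
        if "team_admin" in actor.roles:
            return User("svc-" + actor.name, roles={"service_account"})
        raise PermissionError("only team admins create service accounts")

    # Feature 3 (partner C): service accounts may impersonate users for support workflows.
    def impersonate(actor: User, target: User) -> User:
        if "service_account" in actor.roles:
            return target  # acts as the target user from here on
        raise PermissionError("only service accounts may impersonate")

    # The chain nobody asked the agent to look for:
    mallory = User("mallory", roles={"api_client"})
    grant_team_admin(mallory, mallory)           # feature 1
    svc = create_service_account(mallory)        # feature 2
    ceo_session = impersonate(svc, User("ceo"))  # feature 3: full takeover

Each reviewer sees only their own feature's diff pass its tests; the hole exists only in the composition, which is exactly the kind of cross-cutting judgment the agent was never asked to make.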
FrankRay78 4 hours ago
I’ve been creating agents to better navigate the early stages of the SDLC. Here are my findings from the last 12 months: I Built A Team of AI Agents To Perform Business Analysis https://bettersoftware.uk/2026/01/17/i-built-a-team-of-ai-ag...