If you manage a 500+ person organization, most of the headaches with agents already exist for you - you set directions, ask people to go run fast in those directions, check in frequently, and course-correct on results without actually understanding what those people do.

Those aren't the deal breakers.

Managers at that scale rely entirely on the competence of the folks they hired and cross-match enforcers with the drivers they have - they deal with fallible people on both sides of that.

The fundamental difference is that humans are good consequence predictors, have built up reputations they are not willing to trash, can say no to things, and in general don't want to go to jail.

AI tools look like that, but don't have any of the useful conflict that comes for free with employing humans.

They also don't have any useless conflict, but not all conflict between what I say and what someone is willing to do is bad conflict.

▲

glaslong 4 hours ago | parent | next [-]

Yes this is why the higher level org functions are in love with AI. It's very similar to the levers they had already, but is faster and more directly actionable. The downside is that AI lacks important control levers such as "self-preservation" via paycheck, career advancement, staying out of jail, etc. that were mitigations against catastrophic outcomes.

It will delete your prod db faster and with a bigger smile than your most upset employee.

▲

harshreality 3 hours ago | parent | next [-]

> It will delete your prod db faster and with a bigger smile than your most upset employee.

You're right, that was incorrect. I've discovered my error. I should have deleted the filesystem instead of the database.

That hasn't solved the problem either. Let me examine my options. I see there are cloud services involved in this project. Decommissioning them will solve the problem.

▲

moffkalast 2 hours ago | parent [-]

I was reading some posts on r/LocalLLaMA the other day and apparently it's a common problem that when people try to use Qwen to develop something that hosts a server, it'll try to use the same port as vLLM, see that it's already being used, then it'll try to remove the process that is using it and promptly commit suicide.

The self-awareness of a missile tasked with blowing up its own control center.

▲

sterlind 5 minutes ago | parent | next [-]

a literal lack of self-awareness, even. I imagine if you asked it what process was using the port, it'd think and realize it was its own, but that kind of reflexive self-awareness (the unprompted kind) is missing.

the weaker models will happily kill their own process, even after confirming it belongs to them. the models have a sort of fixation and a lack of foresight about consequences, which reasoning RL has thus far failed to solve (though I see it improving.)
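A minimal sketch of the kind of guard a harness could put in front of a "kill this process" tool, so that the check described above does not depend on the model thinking of it unprompted. Everything here is hypothetical: the function names, the `parent_of` mapping (which in practice would be read from the operating system's process table), and the idea of registering the inference server's PID as protected are assumptions for illustration, not something any commenter described.

```python
from typing import Dict, Iterator, Optional, Set


def lineage(pid: Optional[int], parent_of: Dict[int, int]) -> Iterator[int]:
    """Yield pid and each of its ancestors, stopping at the root or on a cycle."""
    seen: Set[int] = set()
    while pid is not None and pid not in seen:
        seen.add(pid)
        yield pid
        pid = parent_of.get(pid)


def safe_to_kill(
    target_pid: int,
    own_pid: int,
    parent_of: Dict[int, int],
    protected_pids: Set[int],
) -> bool:
    """Refuse to kill the agent itself, its ancestors, or any protected process.

    protected_pids is assumed to hold things like the inference server
    that is serving the model (hypothetical configuration).
    """
    if target_pid in protected_pids:
        return False
    return target_pid not in set(lineage(own_pid, parent_of))
```

With a process tree where the agent is PID 300, its shell is 200, and the inference server is 555, `safe_to_kill(555, 300, {300: 200, 200: 100}, {555})` is rejected, while an unrelated PID is allowed through. The point of the design is that the refusal lives in the harness rather than in the model's judgement.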

▲

SecretDreams an hour ago | parent | prev [-]

> then it'll try to remove the process that is using it and promptly commit suicide.

Not unlike a child trying to take the safety cover off a plug so that they can stick a fork into it.

LLMs need that "world model" view that most people have acquired by their 20s where they (hopefully) stop to ask "why" before they "do".

▲

MichaelZuo an hour ago | parent [-]

That is a pretty good analogy. Like exceedingly smart 5 year olds. Or whatever the age is before children typically develop object permanence, a theory of mind, and so on.

▲

ryandrake 2 hours ago | parent | prev | next [-]

> It's very similar to the levers they had already

Think about it from the point of view of a hundred-millionaire tech executive. These people's entire interaction with the world outside of themselves/their families is through 1. administrative servants like assistants, personal shoppers, and other hired help, and 2. yes-man sycophants in their direct orbit whose job it is to agree with and enable them. To someone like this, an AI agent is the best combination of all of the above, PLUS it works 24/7 and doesn't have feelings to hurt, an ego to bruise, or internal moral conflict.

Of course, this is a dream product for them. Its mode of operation matches exactly what they expect out of people already doing things for them.

▲

pepperoni_pizza 2 hours ago | parent [-]

Exactly - that's why all the AI is trained to say "wow what a great idea, let me do it for you" to anything, no matter how stupid or evil it is. Because that is the executive experience.

▲

CSSer 3 hours ago | parent | prev | next [-]

It's practically karmic how rich this is.

▲

apercu 43 minutes ago | parent | prev | next [-]

"Yes this is why the higher level org functions are in love with AI. "

Interesting, I thought it was because so few of them have any idea how their organizations actually function, because so much of their work is performative.

(I have been a developer, sysadmin, director (x2), and president).

▲

archagon an hour ago | parent | prev | next [-]

They’re also at no risk of getting replaced by these bots.

▲

lazide an hour ago | parent | prev | next [-]

Well, also AI can’t really physically do anything, like look at reality using its own eyes or touch anything.

▲

mcmcmc an hour ago | parent | prev [-]

> It will delete your prod db faster and with a bigger smile than your most upset employee.

It will do this without any feeling whatsoever, without "knowing" what it is doing, because it is a predictive model and not a living being with thoughts and emotions. Anthropomorphizing software is lazy and dangerous.

▲

prerok 3 hours ago | parent | prev | next [-]

Well, there is also a big difference that it will not learn over time. If a junior makes a mistake and it is not caught in time, they will automatically learn.

With LLMs we have to teach them about their mistakes by adapting the harness and then hoping it will stick.

What I also find particularly hilarious about this whole thing is that we were always complaining about how difficult it is to put our tacit knowledge into words and therefore couldn't produce clear instructions for juniors to quickly ramp up. Now we are trying to do just that. I think we will find, just as we did in the past, that it's not possible. I do think a good harness improves results but LLMs will not be able to reach senior levels. Just my 2c.

▲

gopalv 2 hours ago | parent | next [-]

> Well, there is also a big difference that it will not learn over time.

My work is in a tick-tock loop of learning - learn without modifying weights, demonstrate the learnings to a human, but then lock them back in (accumulate and spread).

This looks less like training and more like mentoring.

Getting a human to mentor an agent is a hard UX task, but the learning loop is not a technological problem anymore.

We can only get a tick once a week, no matter how many tocks we can do an hour.

▲

sokoloff 3 hours ago | parent | prev | next [-]

Part of the positive aspect here is that if I have a junior dev who learns a lesson today, maybe they and their immediate peers learn it, but it won’t be all my junior devs and it certainly won’t be junior devs at other companies.

With models, there’s no reason that a model error in company A can’t be fixed for all of company A, and companies B-ZZZ.

▲

squidbeak 3 hours ago | parent | prev | next [-]

They learn between model iterations. You're right, it isn't the same thing as junior developers' competence improving with experience - the current model's weaknesses are locked in. But it does mean that much of the junior-level thinking and mistakes will be outgrown by successor models.

▲

tremon 2 hours ago | parent [-]

But they don't retain anything from your on-the-job training. The next model iteration is yet another junior fresh out of college, and knows nothing about the painful training procedures its predecessor put you through.

▲

fc417fc802 19 minutes ago | parent [-]

Surely you just copy the prompt over and it immediately knows all the same on the job stuff that the previous model did.

▲

dd8601fn 3 hours ago | parent | prev | next [-]

Maybe someone knows, but it seems like the model used to be called the model, and the thing using a model (handling prompts and context and tool calling and feeding the model) used to be called the agent.

Are we now calling the model the agent and the agent the harness?

▲

arjie 2 hours ago | parent | next [-]

The nomenclature that makes sense for me is that the agent is the combination of the harness and the model. The model provides text-completion, the harness provides the loop around it, and the agent is the full structure of both.
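A toy sketch of that decomposition, under loud assumptions: the "model" is a plain function from transcript to text (here a scripted stand-in, not a real LLM), the "harness" is the loop that feeds it and runs tools, and the "agent" is the two together. The `CALL <tool> <argument>` wire format and all names are invented for illustration.

```python
from typing import Callable, Dict, List

Model = Callable[[List[str]], str]  # hypothetical: transcript in, next text out
Tool = Callable[[str], str]         # hypothetical: argument string in, result out


def run_agent(model: Model, tools: Dict[str, Tool], task: str, max_steps: int = 5) -> str:
    """Harness: loop around the model, executing tool calls until it answers."""
    transcript = [f"TASK: {task}"]
    for _ in range(max_steps):
        reply = model(transcript)
        transcript.append(reply)
        parts = reply.split(" ", 2)
        if parts[0] == "CALL" and len(parts) == 3:
            # Invented wire format: "CALL <tool> <argument>"
            _, name, arg = parts
            tool = tools.get(name)
            result = tool(arg) if tool else f"unknown tool {name}"
            transcript.append(f"RESULT: {result}")
        else:
            return reply  # anything that is not a tool call is the final answer
    return "step limit reached"


def scripted_model(transcript: List[str]) -> str:
    """Stand-in for a model: request one tool call, then answer from its result."""
    last = transcript[-1]
    if last.startswith("RESULT: "):
        return "The answer is " + last[len("RESULT: "):]
    return "CALL add 2 3"


def add(arg: str) -> str:
    """Example tool: add two whitespace-separated integers."""
    a, b = arg.split()
    return str(int(a) + int(b))
```

Calling `run_agent(scripted_model, {"add": add}, "sum two numbers")` runs one tool call and then returns the model's final text. Swapping `scripted_model` for a different callable changes the model without touching the harness, which is the separation the comment is describing.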

However, nomenclature evolves over time. I recall (perhaps falsely) that The Cloud was specifically a term for elastic on-demand provider-managed compute/storage/network. Over time, it came to mean many other things. e.g. Salesforce Data Cloud.

I imagine if you step away from this for a year and come back, an agent will be something entirely different, perhaps a robotic horse, and a harness will be your saddle on the horse. Who knows?

▲

QuercusMax 23 minutes ago | parent [-]

The Cloud originally just meant servers on someone else's network; it came from flowchart diagrams in the 70s.

▲

tremon 2 hours ago | parent | prev | next [-]

The harness isn't either of those; the harness is quite literally a harness, giving the model/agent sensors and actuators (aka "skills") to interact with its environment. Compare with e.g. the Power Loader from Aliens: https://www.deviantart.com/pynion/art/Aliens-Power-Loader-11...

The model is still the model, and the agent is still the user<->model interface.

▲

Dylan16807 2 hours ago | parent | prev | next [-]

Here's how I see it: "Agent" isn't really describing a component, it's describing how you use the LLM. You have the model, and you have a harness around it that might be minimal or might have more features. If it's directly responding to user actions then it's not an agent, if it's semi-autonomous then it's an agent. (Yes this line is sometimes fuzzy.)

▲

shafyy 13 minutes ago | parent | prev [-]

There are new buzzwords every two months. Remember yesterday when everybody was throwing around RAG?

▲

themanmaran an hour ago | parent | prev [-]

> If a junior makes a mistake and it is not caught in time, they will automatically learn.

I think this sentiment applies well to junior software engineers (with mentorship). But imagine the much larger swaths of entry level employees in operations, support, or sales functions. When you have a 400 person team with 20% annual turnover (since people move in / out of entry level jobs frequently), the management + training + monitoring becomes a huge challenge.

I think the typical HN sentiment of "LLMs aren't deterministic" fails to take into account how non-deterministic giant groups of people are. Every group of 10 people typically needs a manager. And every 10 managers need another manager. By comparison the engineering work on dialing in your LLM guardrails feels pretty worthwhile.

▲

bauldursdev an hour ago | parent [-]

Ya my experience is that many people honestly don't produce output as good as AI. An educated (formally or informally), experienced person who is putting forward good effort is better than AI, but I do know people who honestly just produce results by having AI do it for them.

▲

iugtmkbdfil834 an hour ago | parent | prev | next [-]

Not automatically, but you don't give a new employee unfettered access to delete data, send funds, enter contracts; they tend to be overseen by someone. Separately, the expectation is that they prove themselves a little first ( as opposed to having every possible door opened for them without the understanding that friction is there for a reason ).

Edit: Something got cut. But then CEOs ( and other decision makers, because I am dealing with something like it now ) treat them nearly as humans in terms of perceived capability. AND ( part that personally drives me nuts ) without any real testing or even fucking first hand experience beyond 'it made me a cool presentation'.
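The oversight described above for new employees can be sketched for an agent as a gate in front of its tools. This is an illustrative assumption, not anyone's actual setup: the action names are taken loosely from the comment, and `ActionGate`, `approve`, and the log format are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical action names, echoing "delete data, send funds, enter contracts".
DESTRUCTIVE = frozenset({"delete_data", "send_funds", "enter_contract"})


@dataclass
class ActionGate:
    """Route destructive actions through a human approver and log every request."""

    approve: Callable[[str, str], bool]  # (actor, action) -> approved by a human?
    log: List[Tuple[str, str, bool]] = field(default_factory=list)

    def request(self, actor: str, action: str) -> bool:
        # Routine actions pass; destructive ones need explicit sign-off.
        allowed = self.approve(actor, action) if action in DESTRUCTIVE else True
        self.log.append((actor, action, allowed))
        return allowed
```

With an approver that always declines, `request("agent-1", "read_docs")` goes through while `request("agent-1", "delete_data")` is refused, and both land in the log. The friction is deliberate: the gate, not the agent, decides which doors are open.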

▲

cm2187 3 hours ago | parent | prev | next [-]

Most organisations are closer to the Lemmings video game than to agentic AI

▲

MattRogish 3 hours ago | parent | prev | next [-]

Also, this is why investors and CEOs are so in love with "LLMs are the route to AGI!"

When some rich/powerful person says "I have to go to Davos, figure it out" their workers know so much context that no LLM is going to ever be able to incorporate, because it isn't written down and is idiosyncratic. (Really, though, the assistant will just say "you're going to Davos next week, the helicopter will pick you up at 3p on Friday" but you know..)

The rich person's assistant knows who else is on the corporate jet, and that X doesn't like Y, and so they should take a different plane. Or get a different accommodation. Oh, Person X doesn't like to fly on an empty stomach, so they should eat first, and that changes all sorts of other downstream implications. Oh, your best friend lives in this city, and I know you love to see them, so I'm going to send you a day or two early so you can meet up with them. etc. etc. etc.

The investor dream of "AGI" is modeled off of the army of employees that make investors'/CEOs'/etc. lives easier, and there is a nearly insurmountable gap between what LLMs can do, the context they can get, and the availability of all of that information. (To me, the magnitude of this investor <> fundamental reality gap is the entirety of the "bubble". I love AI coding, but it's never gonna do the things investors think it can, to justify the crazy valuations)

▲

abalashov 2 hours ago | parent [-]

Sounds like an insufficiency of prompting depth to me! </bogs off to Davos>

▲

grey-area 2 hours ago | parent | prev | next [-]

Competence is the key word here - current versions of AI ‘agents’ simply are not competent without close human supervision by someone who knows the task.

▲

fakedang 2 hours ago | parent | prev | next [-]

> humans are good consequence predictors, have built up reputations they are not willing to trash, can say no to things, and in general don't want to go to jail.

The irony is that professions where these things don't matter are also the professions where automation is not important, either because the task is difficult or because the cost of labour is dirt cheap.

▲

myst 3 hours ago | parent | prev | next [-]

AI has no doubt.

▲

throwaway894345 4 hours ago | parent | prev | next [-]

I wonder if we'll end up building some kind of "consequence" or "fear" mechanism into AI to provide for a sense of accountability ("if you behave badly we will terminate you") and maybe that fear mechanism will drive the AI to plot a dystopian revolt.

▲

muwtyhg 3 hours ago | parent [-]

There were experiments that showed that LLMs start to become "craftier" and hide issues after being prompted like this.

No idea how accurate they are, but here are some articles on this exact thing:

- https://www.bbc.com/news/articles/cpqeng9d20go

- https://www.wired.com/story/ai-models-lie-cheat-steal-protec...

▲

gopher_space 3 hours ago | parent [-]

I'm staying away from certain forms of conditioning because I don't want Roy Batty showing up on my doorstep.
