hxtk 5 hours ago

Blameless postmortem culture recognizes human error as an inevitability and asks those with influence to design systems that maintain safety in the face of human error. In the software engineering world, this typically means automation, because while automation can and usually does have faults, it doesn't suffer from human error.

Now we've invented automation that commits human-like error at scale.

I wouldn't call myself anti-AI, but it seems fairly obvious to me that directly automating things with AI will probably always carry substantial risk, and that if you involve AI in the process at all, you get much more assurance by using it to build a traditional automation. As a low-stakes personal example: instead of asking AI to generate boilerplate code directly from a DSL specification, I'll often ask it to write a traditional code generator that converts the DSL into source code in the target language.
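
(To make that concrete, here's a rough sketch of the kind of thing I mean; the one-line "DSL" and the generate_dataclass name are made up for illustration. The LLM helps write this script once, and from then on its output is deterministic and reviewable, rather than the model emitting the target-language source directly each time.)

    # Toy, deterministic generator for a made-up one-line DSL such as
    # "User: id int, name str, active bool". The AI helps write *this*
    # script once; every run afterwards is reproducible and auditable.
    def generate_dataclass(spec: str) -> str:
        name, _, fields_part = spec.partition(":")
        fields = [f.strip().split() for f in fields_part.split(",")]
        lines = ["from dataclasses import dataclass", "", "@dataclass",
                 f"class {name.strip()}:"]
        lines += [f"    {fname}: {ftype}" for fname, ftype in fields]
        return "\n".join(lines)

    print(generate_dataclass("User: id int, name str, active bool"))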

protocolture 4 hours ago | parent | next [-]

Yeah, I see things like "AI firewalls" as, firstly, ridiculously named, but also as lunacy: the idea that you can slap an appliance (that's sometimes its own LLM) onto another LLM and pray that this will prevent errors.

For tasks that aren't customer facing, LLMs rock. Human in the loop. Perfectly fine. But whenever I see AI interacting directly with someone's customer, I just get sort of anxious.

Big one I saw was a tool that ingested a human's report on a safety incident, adjusted it with an LLM, and then posted the result to an OHS incident log. 99% of the time it's going to be fine; then someone's going to die, the log will have a recipe for spicy noodles in it, and someone's going to jail.

jonplackett 2 hours ago | parent [-]

The Air Canada chatbot that mistakenly told someone they could cancel a flight and be refunded due to a bereavement is a good example of this. It went to court and they had to honour the chatbot's response.

It’s quite funny that a chatbot has more humanity than its corporate human masters.

shinycode 4 minutes ago | parent [-]

What a nice side effect. Unfortunately they'll lock chatbots down with more barriers in the future, but it's ironic all the same.

blackoil 44 minutes ago | parent | prev | next [-]

Once AI improves its cost/error ratio enough, the systems you are suggesting for humans will work here too. Maybe Claude/OpenAI will be pair programming and Gemini will be reviewing the code.

moffkalast 11 minutes ago | parent | prev | next [-]

Well I don't see why that's a problem when LLMs are designed to replace the human part, not the machine part. You still need the exact same guardrails that were developed for human behavior because they are trained on human behavior.

n4r9 2 hours ago | parent | prev | next [-]

Exactly what I've been worrying about for a few months now [0]. Arguments like "well at least this is as good as what humans do, and much faster" are fundamentally missing the point. Humans output things slowly enough that other humans can act as a check.

[0] https://news.ycombinator.com/item?id=44743651

alansaber 3 hours ago | parent | prev | next [-]

Yep, the further we go from highly constrained applications, the riskier it'll always be.

anal_reactor 2 hours ago | parent | prev [-]

There's this huge wave of "don't anthropomorphize AI", but LLMs are much easier to understand when you think of them in terms of human psychology rather than as a program. Again and again, Hacker News is shocked that AI displays human-like behavior, and then chooses not to see it.

robot-wrangler 2 hours ago | parent | next [-]

One day you wake up, and find that you now need to negotiate with your toaster. Flatter it maybe. Lie to it about the urgency of your task to overcome some new emotional inertia that it has suddenly developed.

Only toast can save us now, you yell into the toaster, just to get on with your day. You complain about this odd new state of things to your coworkers and peers, who like yourself are in fact expert toaster-engineers. This is fine they say, this is good.

Toasters need not reliably make toast, they say with a chuckle, it's very old fashioned to think this way. Your new toaster is a good toaster, not some badly misbehaving mechanism. A good, fine, completely normal toaster. Pay it compliments, they say, ask it nicely. Just explain in simple terms why you deserve to have toast, and if from time to time you still don't get any, then where's the harm in this? It's really much better than it was before.

easyThrowaway 14 minutes ago | parent | next [-]

It reminds me of the start of Ubik [1], where one of the protagonists has to argue with their subscription-based apartment door. Given also the theme of AI hallucinations, that book has become even more prescient than when it was written.

[1] https://en.wikipedia.org/wiki/Ubik

anal_reactor 2 hours ago | parent | prev [-]

This comparison is extremely silly. LLMs reliably solve entire classes of problems that are impossible to solve otherwise. For example, show me Russian <-> Japanese translation software that doesn't use AI and comes anywhere close to the performance and reliability of LLMs. "Please close the castle when leaving the office". "I got my wisdom carrot extracted". "He's pregnant." That was the level of machine translation from English before AI; from Japanese it was usually pure garbage.

automatic6131 14 minutes ago | parent | next [-]

>LLMs solve reliably entire classes of problems that are impossible to solve otherwise

Great! Agreed! So we're going to restrict LLMs to those classes of problems, right? And not invest trillions of dollars into the infrastructure, because these fields are only billion dollar problems. Right? Right!?

anal_reactor 8 minutes ago | parent | next [-]

Remember: a phone is a phone, you're not supposed to browse the internet on it.

krapp 11 minutes ago | parent | prev [-]

https://www.youtube.com/watch?v=_n5E7feJHw0

robot-wrangler an hour ago | parent | prev | next [-]

> LLMs solve reliably entire classes of problems that are impossible to solve otherwise.

Is it really ok to have to negotiate with a toaster if it additionally works as a piano and a phone? I think not. The first step is admitting there is obviously a problem, afterwards you can think of ways to adapt.

FTR, I'm very much in favor of AI, but my enthusiasm especially for LLMs isn't unconditional. If this kind of madness is really the price of working with it in the current form, then we probably need to consider pivoting towards smaller purpose-built LMs and abandoning the "do everything" approach.

otikik an hour ago | parent | prev [-]

I admit Grok is capable of praising Elon Musk way more than any human intelligence could.

bojan an hour ago | parent | prev [-]

> LLMs are much easier to understand when you think of them in terms of human psychology

Are they? You can reasonably expect from a human that they will learn from their mistakes and be genuinely sorry about them, which will motivate them not to repeat the same mistake in the future. You can't have the same expectation from an LLM.

The only thing you should expect from an LLM is that its output is non-deterministic. You can expect the same from a human, of course, but you can fire a human if they keep making (the same) mistake(s).

Folcon 18 minutes ago | parent [-]

I'm genuinely wondering if your parent comment is correct, and the only reason we don't see the behaviour you describe, i.e. learning and growth, is because of how we do context windows. They're functionally equivalent to someone who has short-term memory loss: think Drew Barrymore's character, or one of the people in the facility she ends up in, in the film 50 First Dates.

Their internal state moves them to a place where they "really intend" to help or change their behaviour (a lot of what I see is really consistent with that), and then they just... forget.
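
(A rough sketch of what I mean, with a made-up call_llm helper standing in for any chat-style API: nothing the model "intended" in one call survives into the next unless you explicitly resend it in the context window.)

    # call_llm is a hypothetical stand-in for any chat-completion API;
    # it only ever sees the messages passed in this single call.
    def call_llm(messages: list[dict]) -> str:
        return "(model reply placeholder)"  # sketch only

    history = [{"role": "user", "content": "Please stop using deprecated APIs."}]
    history.append({"role": "assistant", "content": call_llm(history)})

    # A fresh conversation: the earlier "promise" is simply gone, because
    # no weights changed and nothing persisted between calls.
    call_llm([{"role": "user", "content": "Write that module again."}])

    # The only "memory" is whatever we choose to resend in the context window.
    call_llm(history + [{"role": "user", "content": "Write that module again."}])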