variety8675 6 hours ago

I've noticed several companies replacing deterministic systems in their support flows with an LLM version that is slower and worse. Many interfaces simply aren't better with AI added.

camdenreslink 6 hours ago | parent | next [-]

The real best case scenario is using LLMs to help build deterministic systems. Instead of asking an LLM to do some task that you know will be repeated, ask the LLM to build a program (Python script or whatever) to do the task.

alexpotato 3 minutes ago | parent | next [-]

100% this.

I've already commented on other posts that having LLMs build deterministic and testable tools is the real unlock.

Even for things like customer service, an LLM that analyzes customer support transcripts and then updates your call tree to better route people is a huge win.

jacobgold 5 hours ago | parent | prev | next [-]

Making systems fully deterministic ignores the entire purpose of having agents involved.

IMHO the best of both worlds option is agents working with deterministic CLIs, where the agent does the reasoning (and text generation) but uses CLIs to carry out all of the actions (issuing refunds, unblocking accounts, or whatever).

It's possible to get very reliable and consistent work out of agents when they're using well written prompts with well designed CLIs.
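
As a rough illustration of the "agent reasons, CLI acts" split described above, here is a minimal sketch of a deterministic refund command. Everything in it is hypothetical: the `refund` program name, the `--order-id` and `--amount-cents` flags, the `MAX_REFUND_CENTS` cap, and the in-memory `ORDERS` table stand in for whatever a real system would use.

```python
import argparse
import json

# Hypothetical data and policy, for illustration only.
ORDERS = {"A100": {"paid_cents": 5000, "refunded_cents": 0}}
MAX_REFUND_CENTS = 10_000


def issue_refund(order_id: str, amount_cents: int) -> dict:
    """Validate and apply a refund; raise ValueError on any rule violation."""
    order = ORDERS.get(order_id)
    if order is None:
        raise ValueError(f"unknown order: {order_id}")
    if amount_cents <= 0:
        raise ValueError("amount must be positive")
    if amount_cents > MAX_REFUND_CENTS:
        raise ValueError("amount exceeds per-refund cap")
    remaining = order["paid_cents"] - order["refunded_cents"]
    if amount_cents > remaining:
        raise ValueError(f"amount exceeds refundable balance ({remaining})")
    order["refunded_cents"] += amount_cents
    return {"order_id": order_id, "refunded_cents": amount_cents,
            "remaining_cents": remaining - amount_cents}


def main(argv=None) -> int:
    """Parse flags, run the refund, and print a JSON result for the caller."""
    parser = argparse.ArgumentParser(prog="refund")
    parser.add_argument("--order-id", required=True)
    parser.add_argument("--amount-cents", type=int, required=True)
    args = parser.parse_args(argv)
    try:
        result = issue_refund(args.order_id, args.amount_cents)
    except ValueError as exc:
        # Machine-readable error so the calling agent can react to it.
        print(json.dumps({"ok": False, "error": str(exc)}))
        return 1
    print(json.dumps({"ok": True, **result}))
    return 0
```

Calling `main(["--order-id", "A100", "--amount-cents", "2000"])` applies the refund and returns 0; in real use `main()` would be wired to a console entry point. The agent only chooses the arguments, while the limits live in code that can be tested and reviewed.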

variety8675 5 hours ago | parent | next [-]

Isn't this how we end up with things like: https://www.reuters.com/legal/government/high-profile-meta-a...

jacobgold 4 hours ago | parent | next [-]

Yes: https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/

Although you can certainly do a better or worse job of preventing these kinds of issues.

bethekidyouwant 5 hours ago | parent | prev [-]

How else would anyone do something like issue a refund if not through a programmatic interface?

sqquima 5 hours ago | parent | next [-]

Direct access to the database, and create the "refund program" on the fly. Yes, stuff of nightmares.

jcgrillo 4 hours ago | parent | next [-]

yes... ha ha ha... yes!

bethekidyouwant 5 hours ago | parent | prev [-]

Right, that's just headcanon though. Unless of course you believe the lies you read on the Internet.

jacobgold 5 hours ago | parent | prev [-]

At some level everything an agent does is through a "programmatic interface" (tool calls).

Some people might use skill-based scripts, MCPs, or some kind of raw access to a database. My point is that well designed CLIs are the optimal programmatic interface, for many reasons.

bethekidyouwant 5 hours ago | parent [-]

Sorry what other option is there? Is it going to create an API call from scratch every time after reading a page of documentation?

Wait raw access to the database? That’s one of the options for issuing a refund?

cflewis 4 hours ago | parent | prev [-]

Yes, it can do.

At Big Tech Company I Work At the LLM is quite happy to make raw API calls. If it thinks the data is big, then it'll write a Python tool to do it.

The reason crafted backing CLIs are useful is you can guide the LLM towards stuff that is immediately useful rather than hoping the nondeterminism can separate the wheat from the chaff.

Take CI: is it interesting to know which tests passed? Maybe, but probably not. What is really interesting is what failed. Instead of having the LLM go out and talk directly to the CI system, write an intermediate CLI that filters out less actionable stuff by default, and have a flag that'll deliver the full dump if necessary.
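
A minimal sketch of that kind of intermediate CLI might look like the following. The input format is an assumption: a JSON list of objects with `name`, `status`, and an optional `message`, read from stdin. The `ci-summary` program name and the `--full` flag are likewise made up for illustration and are not any real CI system's interface.

```python
import argparse
import json
import sys


def summarize(results: list[dict], full: bool = False) -> str:
    """Render CI results; by default only failures plus a one-line tally."""
    failed = [r for r in results if r["status"] != "passed"]
    shown = results if full else failed
    lines = [f"{len(results) - len(failed)} passed, {len(failed)} failed"]
    for r in shown:
        detail = f": {r['message']}" if r.get("message") else ""
        lines.append(f"[{r['status']}] {r['name']}{detail}")
    return "\n".join(lines)


def main(argv=None, stdin=sys.stdin) -> int:
    """Read JSON results from stdin and print the filtered summary."""
    parser = argparse.ArgumentParser(prog="ci-summary")
    parser.add_argument("--full", action="store_true",
                        help="include passing tests as well")
    args = parser.parse_args(argv)
    results = json.load(stdin)
    print(summarize(results, full=args.full))
    # Non-zero exit when anything failed, like most test runners.
    return 1 if any(r["status"] != "passed" for r in results) else 0
```

The default output keeps the model's context focused on failures; the full dump is still one flag away when it is actually needed.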

It's a skill to do this stuff, and it's more hard-won experience than something I think is easily teachable. You kind of have to feel out your model and how it "thinks" about solving problems.

And then a new model version comes out and you have to learn it all again!

dakiol 5 hours ago | parent | prev | next [-]

If it's a one-off script/program that doesn't require additional "domain knowledge", sure. But what if you need to give as context your whole backend repository because you need to take into account a few business rules? Why give anthropic/openai access to my "secret sauce" (e.g., company private repos)?

In that case, it's way better to simply write the code yourself.

mhss an hour ago | parent | next [-]

Of all possible concerns, "giving access to anthropic/openai" to your "secret sauce" is the least important one for 99% of the companies out there.

No, it is not way better to simply write the code yourself. Most code is written faster and better with Claude Code or equivalent. Very niche code is better written by hand. Even then, you're probably better off nudging something like Claude Code in the direction you need it to go. There's nothing interesting about writing it yourself unless you're still learning to code (in which case it is a learning exercise for you, not only about the outcome).

daishi55 2 hours ago | parent | prev [-]

I promise OpenAI is not going to steal your “secret sauce”

AlienRobot 5 hours ago | parent | prev | next [-]

The best case scenario for an LLM is transforming input into output where both are languages and accuracy doesn't matter, e.g. "rewrite this poem in pirate speech."

But that's not worth trillions of dollars...

JCTheDenthog 6 hours ago | parent | prev [-]

Or just write it yourself?

whehhshs 5 hours ago | parent | next [-]

Because typing “code” takes time and significant amounts of it.

We are slowly waking up to the fact, which was always true, that “coding” is just a fanciful preparatory task in order to appease the spirits properly so that we may invoke the spirit of what we are actually after: a live, running process that does useful things. Code is completely useless when separated from that fact.

Typing it is a complete waste of time unless getting up close and personal with it will result in some kind of useful and actionable improvement in you or your understanding. Knowing when it does and when it does not have this property is a skill of its own.

quacked 5 hours ago | parent | next [-]

> Typing it is a complete waste of time unless getting up close and personal with it will result in some kind of useful and actionable improvement in you or your understanding.

I believe this is the general belief about basically every human skill, that if you stop doing the technical fundamentals you get worse at understanding the activity. The question is whether coding is like sailing a square-rigged wooden ship, which became completely useless knowledge after the invention of the steam engine, or if it's like playing an instrument, which while technically unnecessary after the advent of MIDI and other tools, absolutely hurts your ability to arrange, compose and perform if the skill is neglected.

For my money: I think the AI scenario is more like the latter, but "humans are worse at coding" isn't the consequence I see coming. I worry that in ten years we will be awash in software that's impossible to understand. I don't think that's happened in any human industry ever. Someone has always understood how the machines are built, even if they're very remote from the users of the machine.

taybin 4 hours ago | parent [-]

The sci-fi novel A Fire Upon the Deep starts with describing a Software Archeologist, who digs through millennia of strata of layers of indirection, and I think we could end up needing that one day.

saltcured 2 hours ago | parent [-]

Do they end up determining that every weird piece of code they find must have been used for religious or ritualistic purposes?

inigyou 5 hours ago | parent | prev | next [-]

No serious programmer is regularly bottlenecked by typing speed. Even the ones who type slowly.

If you find yourself writing repetitive code you should consider adding a layer of abstraction. If your language isn't powerful enough you can write a code generator.

nik282000 5 hours ago | parent | prev | next [-]

> Typing it is a complete waste of time unless getting up close and personal with it will result in some kind of useful and actionable improvement in you or your understanding.

Like, perhaps, understanding that it is free of security and functionality bugs.

gloosx 2 hours ago | parent | prev | next [-]

This is such a delusional take it's borderline trolling. Code is an expression tool to precisely describe a process that does useful things. Typing prompts is not too different from writing some very vague code, which is arguably a waste of time by itself.

krona 5 hours ago | parent | prev | next [-]

The typing was never the bottleneck.

satvikpendem 5 hours ago | parent | next [-]

Based on what I'm using AI for these days, seems like it always was.

Philip-J-Fry 4 hours ago | parent [-]

It depends on where you're using AI. If you're working on a project for yourself or in a tiny company, then sure, writing the code probably was your bottleneck. But at mid to large companies writing code is maybe 50% of the job, and the other 50% is the process around it. All those processes are the bottleneck, no matter how fast you can write the code. And this was a bottleneck I was hitting well before AI.

whehhshs 5 hours ago | parent | prev [-]

Can you type a hundred lines a second? If not, then it is.

Code is obscenely low level.

skydhash 5 hours ago | parent [-]

> Can you type a hundred lines a second? If not, then it is.

No one has ever needed to do that for something that is new. And if it’s not new, you want to do it repeatedly with some guarantee of reliability. Not just in an uncontrolled manner.

That is why we have snippet systems, macros and code generators. And the best thing with code is to solve a problem once and reuse the solution, which we have done with libraries, frameworks and supporting software.

jcgrillo 4 hours ago | parent | prev | next [-]

> a live, running process that does useful things

That is one of the things code does. It also communicates the developer's thoughts about how that process should work to others. If the latter is neglected, the code becomes very difficult to collaborate on. Very few lines of code that are written are "write once". Mostly they're changed, repeatedly, over time by many people. The live, running process is a very temporary entity by comparison. Yes, it needs to exist and do useful work. No, it is absolutely not the only thing that matters.

wtetzner 4 hours ago | parent | prev [-]

> Typing it is a complete waste of time unless getting up close and personal with it will result in some kind of useful and actionable improvement in you or your understanding.

I would argue that this is nearly always the case. I don't think people really understand programs that they've only read at more than a very superficial level. This is why I tend to make (temporary) small changes, printlns, etc. when exploring a new code base: it aids greatly in understanding how a program actually works.

And it's even worse (in my experience) with LLM generated code, as it tends not to result in particularly understandable code. It is a lot like LLM generated prose: it often looks entirely reasonable at a surface level, but has a lot of weirdness/incorrectness hidden beneath the surface. But that surface level makes it very hard to avoid glossing over the details when reviewing the code. For this reason, I personally find it's much more effort to carefully review code than it is to write it.

Humans make mistakes all the time, but their code tends to naturally be structured for human understanding (to some degree based on skill/experience) because they themselves needed to understand it to write it.

I think LLMs are very useful tools, but after quite a lot of experience using them, I think it's generally better to use them as a sounding board, or to help you get unstuck or remove points of friction. Using them to write all of your code (at least for me) seems like a net negative.

I also think it's extremely easy to overestimate how much time they save. It feels like they're a productivity boost because it takes less intense focus to implement something. But I've experienced several instances where actually writing the code myself would have been both quicker and have resulted in better code.

All that being said, it can also be really hard to not write all of your code with agents once you get used to it. There's also a kind of slot-machine-like effect where you write a prompt, excited for the result, and when it doesn't quite come out right, you think "ah just one more prompt and it'll be good." It's hard to see when you're actually doing it though.

It's also weird to me how much people think typing is what the LLM is replacing. Typing was never the hard part. It's the translation of the high-level idea into an unambiguous process that's hard. That's also the valuable part, that requires thinking through the edge cases and consequences of decisions, and that just gets glossed over when using an LLM unless you rigorously review what the LLM has done.

At the end of the day there's a real tradeoff to be made, and it's worth being conscious of what's being given up.

dukeyukey 5 hours ago | parent | prev [-]

If you already know what the inputs/outputs are, why should you spend days or weeks of your life typing it out rather than giving it in a well-specified and tested form to an LLM to get it done a hundred times faster?

xigoi 25 minutes ago | parent | next [-]

The behavior of an LLM is not and cannot be “well-specified”.

JCTheDenthog 2 hours ago | parent | prev | next [-]

>rather than giving it in a well-specified and tested form

So, code?

dosisking 5 hours ago | parent | prev | next [-]

Because the LLM version will have countless number of bugs and security holes, which means you will spend weeks or months of your life fixing them.

chasd00 5 hours ago | parent | prev | next [-]

This is a truth that many are having a hard time accepting. Getting shoved into the light so fast is blinding.

skydhash 4 hours ago | parent | prev [-]

Because it’s rarely so black and white. Knowing the inputs and outputs is merely the first step; you need to think about the transitions too, as they have their own costs.

Those costs don’t disappear and it’s truly naive to think they don’t matter. Take security issues: they may arise because what you think was the input is merely a subset of the true input range. And the extra possibilities lead to unforeseen behavior.

A lot of programming is about ensuring that the input and the output are the sets defined in the specs. And the rest is that the transition/relation is the right tradeoffs of performance, correctness, and costs.
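
To make the "true input range" point concrete, here is a small sketch under assumed requirements: the spec says a quantity is an integer from 1 to 100, but what actually arrives is an arbitrary string. The function name `parse_quantity` and the 1..100 range are made up for illustration.

```python
def parse_quantity(raw: str, maximum: int = 100) -> int:
    """Narrow an arbitrary string to the specified set: integers 1..maximum."""
    text = raw.strip()
    # str.isdigit() accepts some non-ASCII digits, so require ASCII too.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a plain non-negative integer: {raw!r}")
    value = int(text)
    if not 1 <= value <= maximum:
        raise ValueError(f"out of range 1..{maximum}: {value}")
    return value
```

Inputs such as `"-1"`, `"1e3"`, `""`, or `"101"` are all part of the true input range even though the spec never mentions them; here each one is rejected explicitly instead of leading to unforeseen behavior further down.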

Philip-J-Fry 5 hours ago | parent | prev | next [-]

I am seeing similar things in just regular tooling and development. Things that can be solved deterministically or what would have been a simple CLI 5 years ago are now an LLM integration.

Instead of using the LLM to create deterministic tools, we are using LLMs to replace them. It's completely backwards and I don't know why people (especially high ranking people in my company at least) seem to think that this is the way forward. No, I don't want a whole CI pipeline that is just LLM prompts. Yes it's very easy, but it's expensive, slow and prone to failure in ways you can't even predict.

Same thing with using LLMs for the code review process. What would have been a simple linting rule is now a pass with an LLM, rather than using the LLM to create the linting rule, which it is absolutely excellent at creating.
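
For a sense of how small such a rule can be, here is a sketch of one written with Python's standard `ast` module. The rule itself (flagging bare `except:` clauses) and the function name `find_bare_excepts` are just examples picked for illustration, not something from a particular linter.

```python
import ast


def find_bare_excepts(source: str) -> list[int]:
    """Return the line numbers of every bare `except:` clause in source."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ExceptHandler) and node.type is None
    )


example = (
    "try:\n"
    "    risky()\n"
    "except:\n"          # line 3: bare except, should be flagged
    "    pass\n"
    "try:\n"
    "    risky()\n"
    "except ValueError:\n"  # typed handler, should not be flagged
    "    pass\n"
)
print(find_bare_excepts(example))  # prints [3]
```

Once written, the check runs in milliseconds and gives the same answer every time, which is the property the LLM review pass lacks.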

IAmGraydon 5 hours ago | parent [-]

>I am seeing similar things in just regular tooling and development.

Yes, and we're also seeing lots of companies claiming they're using "AI" and it's just deterministic under the hood.

al_borland 6 hours ago | parent | prev | next [-]

My management is pushing for us to come up with ideas on where we can use LLMs in our product. The whole team has been very resistant for this exact reason. Anything we can think of will only make things worse, and we’ve already been told anything above a 1-2% failure rate is unacceptable. If anything we need more structure and standards to hit that, not less.

rueh 5 hours ago | parent | next [-]

I believe that LLMs can be used to re-imagine experiences, but it’s definitely not the way people think. The constraint is imagination and thinking about complex trade-offs more than anything else, which is the essence of innovation.

The agent paradigm will eventually give way to experiences that are a hybrid of deterministic and non-deterministic, and you won’t even know the LLM was involved or visible.

pjmlp 5 hours ago | parent | prev | next [-]

We just got dropped into hackathons for having ideas a few weeks ago, AI at all costs, similar feeling.

iwontberude 5 hours ago | parent | prev [-]

Luckily, for programmatic or logic-following tasks, smaller models tend to do better. It can be surprising at first to see the more expensive models do worse at a task, but it’s true.

gnuvince 5 hours ago | parent | prev | next [-]

Basically, folks nowadays think that this article[1] was aspirational rather than a cautionary tale.

[1] https://thedailywtf.com/articles/Classic-WTF-No-Quack

sdesol 4 hours ago | parent | prev | next [-]

> replacing deterministic systems in their support flows

The issue is, they don't want to provide "better" support but "cheaper" support. Imagine a trained agent that understands the big picture. Now imagine a company investing in humans to use AI to retrieve knowledge that the human can easily identify as being relevant or not, and using that knowledge to better aid the customer.

Right now AI is being sold as "we don't need support personnel" instead of "how can we provide better service." For a lot of products, better service will probably not matter as "cheaper" products will win most of the time.

Most people don't want to pay for better. They want to pay the same for something better, and I think that is what companies are not investing their time in: figuring out how to use AI properly.

Hendrikto 4 hours ago | parent [-]

> Most people don't want to pay for better.

A lot of people want to pay for better, but that is hard. Better is more expensive, most of the time, but being more expensive is no guarantee for being better. It feels like the correlation is very weak. Most expensive products are just expensive, not good.

If there was a reliable way to identify the "better" thing, I and a lot of other people would go for that every time we can.

thatjoeoverthr 3 hours ago | parent | prev | next [-]

Unwise design. “It talks, CSRs talk, it’s the same thing”. The fact CSRs talk is incidental. Nobody contacts support to talk. Customer service is a kind of “exception handler” for that which you failed to automate. If your system exists, works and is legible, conversation is avoided.

filup 5 hours ago | parent | prev | next [-]

That's the complete opposite of what people should do. The laborious task of programming logical work flows is the only reason AI is useful for me.

reg_dunlop 5 hours ago | parent [-]

When I hear about engineers who are bored with coding, I have to imagine it's because the task of "programming logical work flows" has become rote to them.

Instead of refining their approach, or challenging their current knowledge base for discovery of inefficiencies or baseless assumptions, they'd rather hit an "easy" button.

I understand the desire to NOT do work. I understand the desire to spend quality time and free time with family. And I understand the idea that familiarity breeds contempt.

What I don't understand is the willingness to replace a deterministic language/framework/approach with a probabilistic slop machine.

romanovcode 6 hours ago | parent | prev | next [-]

As a contractor who built a lot of predictive systems and workflows in the last three years, I can tell you that quite often there is a specific request to put AI into it even when it is not needed and would objectively make the system worse, slower and more expensive.

The AI psychosis is a real thing.

KellyCriterion 5 hours ago | parent | next [-]

Haha, I have a colleague, he is the "AI-is-for-everything-let-me-check-Claude-first" type:

Regardless of which task is handed to him, he "discusses" it first with Claude and very often comes back with something like "The AI said... X"

Philip-J-Fry 4 hours ago | parent | next [-]

I have one too. He'll say "Claude says this:" and pastes a screenshot of some Claude Code output. Most of the time it's wrong, or makes assumptions that won't hold true. Or it comes up with some overcomplicated solution and I'm like "This is like a 10 line change, right here".

These people just destroy their ability to read and understand the systems they're working with. I actually see it as them making themselves redundant. Because if you can't understand anything without Claude, and Claude doesn't even give the right answers, then what are you worth?

KellyCriterion 2 hours ago | parent [-]

...and now think about Claude being shut down by the gov... :-D

anal_reactor 5 hours ago | parent | prev [-]

I talk to Claude because I'm very talkative but I have nobody to talk to.

pjmlp 5 hours ago | parent | prev | next [-]

I keep seeing requests to replace what would be a perfect UNIX shell script with agents, like what is the benefit other than being able to say we're doing AI?

cwnyth 6 hours ago | parent | prev | next [-]

Where I work, management hasn't considered integrating AI at all, yet some clients are very vocal about it being the future and worry we are going to be left behind. Most people just don't care, and I worry the squeaky wheel will eventually get the grease.

nutjob2 5 hours ago | parent [-]

> worry we are going to be left behind.

I bet lemmings are grateful they were left behind.

It beggars belief that people think that they should rush in some uncertain direction, like some drawbridge is going to be lifted the moment people work out what the right direction is. It's utter stupidity.

inigyou 5 hours ago | parent [-]

Every single person who bootstrapped becoming powerful did it by rushing into things, but it's a high variance strategy because you could also end up destitute

QuantumNomad_ 6 hours ago | parent | prev | next [-]

So then, do you put AI into it anyway because they asked for it, or do you tell them that you won’t do that?

romanovcode 5 hours ago | parent [-]

> you tell them that you won’t do that?

Of course I will do that, I get paid for doing that.

Most of the time I can convince them that AI is not necessary by showing a small PoC flow with AWS diagrams of data flows. This works well especially if the ask comes from technical people.

Other times the C-level interjects (CEO, CFO, sometimes even CTO) and demands that AI should be there. I literally had CEOs send me instagram reels of some AI shovel-sellers to demonstrate that I am wrong and AI is the way to go. No point arguing after that because I have no problem implementing whatever AI they want rather than losing a paying project.

nik282000 3 hours ago | parent [-]

Use 'AI' to create anti-AI reels showing how much they suck at all tasks. Spam CEO's underlings.

ezst 4 hours ago | parent | prev [-]

Maybe it should have clicked earlier in life and I'm perhaps that much of a dumb dumb, but it only recently occurred to me (from experiencing it at two very different companies and discussing with peers having reached a certain seniority level more or less at the same time) how dysfunctional many companies are, and how often they produce incentives that are misaligned with the overall company goals and sustainability principles. I blame in large part a layer of middle management that selfishly puts itself above all else, misguides, misrepresents, because it essentially pays larger dividends (literally and not) to "play the networking game" than to be an efficient and effective productive structure. Maybe that's to be expected in a services-driven economy where the value of the work is immaterial and subjective (and the whole phenomenon of bullshit jobs).

throwatdem12311 5 hours ago | parent | prev | next [-]

Yeah but did number go up? Can CEO check a box to show investors?

Now that’s real value.

gedy 5 hours ago | parent | prev | next [-]

With inexperienced or non-technical people, talking to them about AI can be very confusing, as a LOT of their "AI" use cases are basically straightforward logic that they didn't realize could be written as an ordinary program, or didn't know how to write.

mikert89 5 hours ago | parent | prev [-]

models will get smarter, this won't be an issue

reg_dunlop 5 hours ago | parent | next [-]

Intelligence, which I assume to be a synonym for "smart" requires the capacity to acquire and apply knowledge from experience.

These models do not have any experience. They're not sentient. And are in no way capable of being "smart", let alone becoming "smarter".

alexjplant 3 hours ago | parent | prev | next [-]

The Claude web UI popped a modal up a few days ago advertising their new model to me. It was full of HTML tags that were escaped or otherwise not rendered so that the text was literally

  <b>Included in your plan limits until Jun 22</b> <br><br>Fable takes 2x the usage of Opus.
  <b> Switch models when a message is flagged</b><k <br> When safety measures flag a message, automatically switch to a different model to keep chatting. When off, your chat will pause instead. <a href="https://support.claude.com/en/articles/153636
  target="_blank" rel="noopener noreferrer" > Learn more</a>

...and this was presumably generated with the flagship model from the world's most prestigious LLM company.

throwatdem12311 5 hours ago | parent | prev | next [-]

They say this every time. Just wait for the NEXT model bro THEN everything will be fixed.

Ok wait maybe not the next one but surely the one after!

Hasn’t happened yet and there is no evidence it will.

laichzeit0 3 hours ago | parent [-]

This all reminds me of in the 90s when the Borland C++ compilers and Turbo Pascal shit came out and everyone was still hand rolling assembler because the optimising compilers were so bad. I thought Opus 4.6 was pretty good, basically a step change. The stuff I got out of Fable before they blocked it was nearly alien. If things keep improving, I don’t see humans writing code in 2 to 3 years except maybe super niche areas. This will all go the way that optimising compilers did. No amount of resistance, anger or denialism will change that.

I’d actually love it if LLMs could skip the slow high level languages entirely and just churned out some weird LLM bytecode that was closer to the metal. I don’t want to read it or understand it at all. Here’s my spec, build it and notify me when done. I want to ship stuff, not build or dick around with code. Basically like when I go to a shop because I want a table, I don’t care if some carpenter “crafted” it or a machine mass produced and spat it out. It’s cute, but most people just want stuff and don’t care how it’s built.

demorro 2 hours ago | parent | next [-]

In every case up until now, a jump in abstraction has moved us forwards in ease of understanding the underlying artifact at a conceptual, human level. High level languages are effectively runnable documentation.

It's possible to say that LLMs producing code may be the same category of thing, but the non-determinism and ephemerality of it all makes it difficult to imagine.

throwatdem12311 3 hours ago | parent | prev [-]

If my experience having to manually unf*ck “production” slop written by sweatshop tier offshore told to go wild with a Claude subscription is any indication then we are a LONG way off from any of that BS.

My job is safe because I’m the only person at the company that actually understands what the actual code is doing and I’m the one that gets the calls at 2am and weekends.

“Weird LLM bytecode”

Why not just generate object code for the target machine directly?

sfn42 5 hours ago | parent | prev [-]

Come talk to me when it isn't an issue.