| ▲ | reedf1 3 days ago |
| What you actually need in most business cases is a 100% auditable, explainable and deterministic workflow. While AI is strictly deterministic - it is technically chaotic. Introducing it in large customer pipelines or in data-intensive applications means that even if the AI is right 99%, 99.9% or 99.99% of the time, you will see large spurious error rates in your workflow. Worst of all, these errors will be difficult to explain - or may even be purposely hidden, as I have seen some agents attempt to do. |
|
| ▲ | IanCal 3 days ago | parent | next [-] |
| You absolutely don't need this. We know this to be true because we use humans, and they are none of these things (at 100%), and we use other ML systems that don't hit all three either. Directionally those things are beneficial, but you just need the benefits to outweigh the costs. |
| |
| ▲ | aprilthird2021 3 days ago | parent | next [-] | | > 100% auditable, explainable and deterministic workflow. Not 100% deterministic workers, but workflow. The auditability and explainability of your system become difficult with AI and LLMs in between, because you don't know at what point in the reasoning things went wrong. For a lot of things, you need to know at every step of the way who is culpable, what part of the work they were doing, why it went wrong, and how. | |
| ▲ | kakacik 3 days ago | parent | prev | next [-] | | Depends on the industry; clearly you have never worked in one like this. Regulated (medical, transport, municipal, state, army and so on), or just with decently enforced regulations like the whole of finance, and bam! you have serious regulatory issues that every single sane business tries desperately to stay away from. | | |
| ▲ | 3 days ago | parent | next [-] | | [deleted] | |
| ▲ | IanCal 3 days ago | parent | prev | next [-] | | “There are business problems” and “most business problems” are not the same thing. | |
| ▲ | foobarian 3 days ago | parent | prev [-] | | > you have serious regulatory issues ... until people decide they are OK with things being less than 100% and relax the regulations. Helped along by the purveyors of the AI tools, no doubt. |
| |
| ▲ | gizajob 3 days ago | parent | prev | next [-] | | The difference is that although humans aren’t 100% accurate, they are responsible for their work. | | |
| ▲ | dwohnitmok 3 days ago | parent [-] | | This has been going down over time. A lot of the software industry has been moving away from assigning humans individual responsibility for failure (e.g. blameless post mortems). | | |
| ▲ | Yoric 2 days ago | parent [-] | | I suspect that it's only a small corner of the software industry, which is itself only a small corner of industry. I further suspect that most actors will still want someone responsible to take the blame when an incident takes place. Even if they have to make one up. |
|
| |
| ▲ | bandrami 3 days ago | parent | prev [-] | | Yeah, no. I make software used in actual flight simulators, and we literally need it to be deterministic, to the extent of needing the same help query to always return the exact same results for all users at all times. | | |
| ▲ | IanCal 3 days ago | parent | next [-] | | Some business problems need that. That's not the same as asserting most do, and it's certainly not the same as saying all do. Some things need to be deterministic. Many don't. Even your business will have many problems that don't need all those properties at 100% - every task performed by a human, for example. You as a developer are not all of these things 100%! And your help query may need to be deterministic, but does it need to be explainable? Many ML solutions aren't really explainable - certainly not to 100%, whatever that may mean - but can easily be deterministic. | |
| ▲ | charcircuit 3 days ago | parent | prev [-] | | If you were on a real flight and asked a human for help, they wouldn't give a deterministic answer. This doesn't seem like an actual requirement so much as a post hoc rationalization, because it was cheaper to build it that way. While terms like consistency may come up when deterministic output is stated as a requirement, the true reason could actually just be cost. | | |
| ▲ | throwup238 3 days ago | parent | next [-] | | > If you were on a real flight and asked a human for help, they wouldn't give a deterministic answer. If you were on a real flight, asking a qualified human - like a trained pilot - would get you a very deterministic checklist. Deterministic responses to emergencies are at least half of the training from the time we get a PPL. | |
| ▲ | hi_hi 3 days ago | parent | prev | next [-] | | Regulated industries (amongst many) need to be deterministic. Imagine your bank being non-deterministic. | | |
| ▲ | charcircuit 3 days ago | parent [-] | | >Imagine your bank being non-deterministic. That's already the case. Payments are not deterministic. It can take multiple days for things to settle. The real world is messy. When I make a payment I have no clue if the money is actually going to make it to a merchant or if some fraud system will block it. | | |
| ▲ | hi_hi 3 days ago | parent | next [-] | | The bank can very much determine if the payment has been made or not (although not immediately, as you mentioned). As a rule, banks like to keep track of money. | |
| ▲ | soco 3 days ago | parent | prev | next [-] | | Yes, it settles deterministically. With AI it claims to be settled and moves on, and it's up to you to figure out how deterministic the whole transaction actually was. | |
| ▲ | Yoric 2 days ago | parent | prev [-] | | Is it the main issue? Payments suffer from race conditions, but the processes themselves are deterministic, auditable and may be rolled back. Not sure how many of these important attributes would remain with a neural network at the helm. |
|
| |
| ▲ | IanCal 3 days ago | parent | prev [-] | | Even then, it can be deterministic but not explainable. TF-IDF is fairly explainable, but it's about the limit, IMO, for full explanations making sense - such that you can fully reason about them and accurately predict outcomes and issues. Embeddings could give better, fully deterministic results, but I wouldn't say they're 100% explainable. |
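As a rough illustration of the difference, a minimal sketch (scikit-learn assumed; the corpus is made up) of why TF-IDF sits at the explainable end: every score decomposes into per-term weights you can read off directly, and the same input always produces the same weights.

    # Toy sketch: TF-IDF is deterministic and its scores decompose into
    # per-term weights you can inspect (the corpus here is invented).
    from sklearn.feature_extraction.text import TfidfVectorizer

    docs = [
        "refund the payment to the customer",
        "flag the payment as possible fraud",
        "customer asked for a refund",
    ]

    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform(docs)

    # The first document's representation is just a readable list of weights:
    terms = vectorizer.get_feature_names_out()
    for term, weight in zip(terms, matrix[0].toarray()[0]):
        if weight > 0:
            print(f"{term}: {weight:.3f}")

An embedding model would also hand you a deterministic vector, but with no comparable story about what each dimension means.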
|
|
|
|
| ▲ | thisisit 3 days ago | parent | prev | next [-] |
| Just a couple of hours ago I was discussing this with a Principal Architect. He is responsible for all the finance workflows. We had just come out of a product demo where the vendor showed workflows that were 100% auditable, explainable and deterministic. It required a human in the loop to double-check the AI's work. The feedback from the architect was that the vendor was way too cautious in using AI. Nearly all the vendors he has seen so far were too cautious. He lamented that no one was fully unleashing AI. They could achieve that by allowing read/write access to confidential data like ERPs/CRMs and access to the internet, while being fully non-deterministic. Then AI could achieve a lot more. I explained that an AI being right 95% of the time is still not good enough for finance workflows, but he wouldn't budge. He kept repeating that going non-deterministic and removing the human from the loop is the way to go. I silently promised myself to stay away from any AI projects he might be part of. |
| |
| ▲ | bilekas 3 days ago | parent | next [-] | | >He kept repeating that going non-deterministic and removing the human from the loop is the way to go. For an "Architect" this is extremely troubling. > He lamented that no one was fully unleashing AI More than likely he will never be the one cleaning up the mess; he will probably be the one contracted to design proper systems afterwards, though, so maybe it's a genius move. | |
| ▲ | yomismoaqui 3 days ago | parent | prev | next [-] | | Just suggest that he implement, or supervise the creation of, a system like that ON HIS RESPONSIBILITY. That is, if the system fails and loses company/client money, he has to pay it from his own account. Then tell us how he sees that 5% error rate. | |
| ▲ | c048 3 days ago | parent | prev | next [-] | | I worked in a finance department for over a decade. That architect is a lunatic or a sheer idiot. | |
| ▲ | suncemoje 3 days ago | parent | prev | next [-] | | I was recently approached by a lawyer who wants to automate legal workflows. “Intriguing” I thought, given the advancements of LLMs / agentic AI + the huge funding rounds I keep seeing in LegalTech. I eventually had to give the project a pass because I didn’t believe I would be able to get AI to consistently produce accurate outputs, EVEN IF the inputs stayed the same. Couldn’t imagine building a deterministic system that scales in the legal domain… | |
| ▲ | Yoric 2 days ago | parent | prev [-] | | So it's ok if 5% of the time, his paycheck is sent to someone else? |
|
|
| ▲ | Joel_Mckay 3 days ago | parent | prev | next [-] |
| The people outside the business of selling hype will not be keen on paying to break their business with popular liabilities. =3 https://www.youtube.com/watch?v=_zfN9wnPvU0 |
|
| ▲ | enraged_camel 3 days ago | parent | prev | next [-] |
| This comment is a bit strange. >> While AI is strictly deterministic - it is technically chaotic AI is neither deterministic nor chaotic. It is nondeterministic because it works based on probability, which means that in open-ended contexts it can be unpredictable. But properly engineered agentic AI workflows can drastically reduce, and even completely eliminate, the unpredictability. Having proper guardrails in place - well-defined prompts, validations and fallbacks - can help ensure that mistakes made by the AI don't become errors in your system. |
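To make "validations and fallbacks" concrete, a minimal sketch (the function names and schema are hypothetical, not any particular framework's API): the model's output is checked against a closed set of actions, and the system falls back to a deterministic path whenever the check fails.

    # Hypothetical guardrail wrapper: validate the model's output against
    # a strict schema and fall back deterministically on any failure.
    import json

    ALLOWED_ACTIONS = {"approve", "reject", "escalate"}

    def guarded_classify(call_model, text, retries=2):
        # call_model stands in for whichever LLM client you use (assumption).
        for _ in range(retries):
            raw = call_model(text)
            try:
                out = json.loads(raw)
            except json.JSONDecodeError:
                continue  # malformed output -> retry
            if out.get("action") in ALLOWED_ACTIONS and out.get("confidence", 0) >= 0.9:
                return out["action"]
        return "escalate"  # deterministic fallback: route to a human

The point is that the unpredictable component is boxed in: it can only ever emit one of three known actions, or hand the case off.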
| |
| ▲ | throw-qqqqq 3 days ago | parent | next [-] | | > AI is neither deterministic nor chaotic. It is nondeterministic because it works based on probability A deterministic function/algorithm always gives the same output given the same input. LLMs are deterministic if you control all parameters, including the “temperature” and random “seed”. Same input (and params) -> same output. | | |
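A toy sketch of the principle (numpy assumed; the logits are invented stand-ins for a real model's output): a fixed seed makes sampling reproducible, and temperature 0 removes the randomness entirely.

    # Toy sampler: with a fixed seed (or greedy decoding) the "model"
    # is a pure function of its inputs.
    import numpy as np

    def sample_token(logits, temperature, seed):
        if temperature == 0:
            return int(np.argmax(logits))  # greedy: no randomness at all
        rng = np.random.default_rng(seed)  # seeded: reproducible randomness
        probs = np.exp(logits / temperature)
        probs /= probs.sum()
        return int(rng.choice(len(logits), p=probs))

    logits = np.array([1.0, 3.0, 2.0])
    assert sample_token(logits, 0.8, seed=42) == sample_token(logits, 0.8, seed=42)
    assert sample_token(logits, 0.0, seed=1) == sample_token(logits, 0.0, seed=2)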
| ▲ | mejutoco 3 days ago | parent | next [-] | | I thought this too, but it seems that is not the case. I could not remember the reason I had seen for it, so I googled it (AI excerpt): Large Language Models (LLMs) are not perfectly deterministic even with temperature set to zero, due to factors like dynamic batching, floating-point variations, and internal model implementation details. While temperature zero makes the model choose the most probable token at each step - a greedy, "deterministic" strategy - these other technical factors introduce subtle, non-deterministic variations in the output. | | | |
| ▲ | District5524 3 days ago | parent | prev | next [-] | | Not that it's incorrect, but there is some data showing variability even with the very same input and parameters, especially if we have no control over the model behind the API, with its engineering optimizations etc. See Berk Atil et al., "Non-Determinism of 'Deterministic' LLM Settings", https://arxiv.org/abs/2408.04667v5 | |
| ▲ | viccis 3 days ago | parent | prev | next [-] | | Ignoring that you are making an assumption about how the randomness is handled, this is a very vacuous definition of "deterministic" in the context of the discussion here, which is AI controlling large and complex systems. The fact that each inference can be repeated if and only if you know and control the seed, and it is implemented with a simple PRNG, is much less important to the conversation than its high-level behavior, which is nondeterministic in this application. If your system is deterministic only when it processes its huge web of interconnected agentic prompts in exactly the same order, then its behavior is not deterministic in any sense that matters for predictable, repeatable system behavior. If I ask you whether it will handle the same task in exactly the same way, and its handling involves lots of concurrent calls that are never guaranteed to be ordered the same way, then you can't answer "yes". | |
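The ordering point doesn't even need a model in the loop to demonstrate - a pure-Python sketch (the delays are invented stand-ins for network jitter):

    # Even if each call is individually deterministic, concurrent calls
    # complete in varying order, so any shared state they touch (an
    # agent's running context, say) can differ between runs.
    import asyncio, random

    async def agent_call(name, context):
        await asyncio.sleep(random.random() / 100)  # stand-in for network jitter
        context.append(name)                        # the call itself is deterministic

    async def main():
        context = []
        await asyncio.gather(*(agent_call(f"call-{i}", context) for i in range(5)))
        print(context)  # order varies from run to run

    asyncio.run(main())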
| ▲ | mbesto 3 days ago | parent | prev | next [-] | | > LLMs are deterministic if you control all parameters, including the “temperature” and random “seed”. This is not true. Even my LLM told me this isn't true: https://www.perplexity.ai/search/are-llms-deterministic-if-y... | |
| ▲ | cnnlives78 3 days ago | parent | prev [-] | | The LLMs most of us are using have some element of randomness in every token selected, which is non-deterministic. You can attempt to corral that, but statistically, with enough iterations, it may produce nonsensical, unintentional, dangerous, or opposite solutions/answers/actions, even if you have system instructions defining otherwise and a series of LLMs checking themselves. Be sure that you fully understand this. Even if you could make it fully deterministic, it would be deterministic for a given model and state, and you'll surely be updating those. It amazes me how little people know about what they're using. |
| |
| ▲ | CjHuber 3 days ago | parent | prev | next [-] | | Are they? I mean, I wouldn't say they are strictly deterministic, but with a temperature of 0, top-k of 0 and top-p of 1 you can at least get them to be deterministic, if I'm correct. In my experience, if you need a temperature higher than 0 for a prompt that sits inside a pipeline, you should optimize the prompt rather than introduce non-determinism. Still, of course, that doesn't mean some inputs won't give unexpected outputs. | |
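For reference, a rough sketch of what those knobs do (numpy assumed; this mirrors the usual filtering logic rather than any specific library's API): they shrink the candidate set before sampling, and with temperature 0 you skip sampling entirely and take the argmax, which is what makes the output repeatable.

    # Toy top-k / top-p filtering over a logits vector.
    import numpy as np

    def filter_logits(logits, top_k=0, top_p=1.0):
        logits = logits.copy()
        if top_k > 0:  # keep only the k highest-scoring tokens
            cutoff = np.sort(logits)[-top_k]
            logits[logits < cutoff] = -np.inf
        if top_p < 1.0:  # keep the smallest set with cumulative mass >= top_p
            order = np.argsort(logits)[::-1]
            probs = np.exp(logits[order] - logits[order][0])
            probs /= probs.sum()
            keep = (np.cumsum(probs) - probs) < top_p  # mass before this token
            logits[order[~keep]] = -np.inf
        return logits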
| ▲ | flufluflufluffy 3 days ago | parent | next [-] | | In the hard, logically rigorous sense of the word, yes they are deterministic. Computers are deterministic machines. Everything that runs on a computer is deterministic. If that wasn’t the case, computers wouldn’t work. Of course I am considering the idealized version of a computer that is immune to environmental disturbances (a stray cosmic ray striking just the right spot and flipping a bit, somebody yanking out a RAM card, etc etc). LLMs are computation, they are very complex, but they are deterministic. If you run one on the same device, in the same state, with exactly the same input parameters multiple times, you will always get the same result. This is the case for every possible program. Most of the time, we don’t run them with exactly the same input parameters, or we run them on different devices, or some part of the state of the system has changed between runs, which could all potentially result in a different outcome (which, incidentally, is also the case for every possible program). | | |
| ▲ | blibble 3 days ago | parent [-] | | > Computers are deterministic machines. Everything that runs on a computer is deterministic. If that wasn’t the case, computers wouldn’t work. GPU operations on floating point are generally not deterministic in practice; they are subject to the whims of the scheduler. | | |
| ▲ | flufluflufluffy 2 days ago | parent [-] | | If the state of the system is the same, the scheduler will execute the same way. Usually, the state of the system is different between runs. But yeah that’s why I qualified it with the hard, logically rigorous sense of the word. |
|
| |
| ▲ | blibble 3 days ago | parent | prev [-] | | > Are they? I mean, I wouldn't say they are strictly deterministic, but with a temperature of 0, top-k of 0 and top-p of 1 you can at least get them to be deterministic, if I'm correct. The mathematics might be, but not on a GPU, because floating-point numbers are an approximation and their operations are not associative. If the GPU's internal scheduler reorders the operations, you will get a different outcome. Remember, GPUs were designed to render Quake, where drawing pixels slightly off is imperceptible. |
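The reordering effect is easy to see even in plain Python, no GPU needed:

    # Floating-point addition is not associative, so a sum reduced in a
    # different order can give a different result.
    a, b, c = 1e16, -1e16, 1.0
    print((a + b) + c)  # 1.0
    print(a + (b + c))  # 0.0

On a GPU the same thing happens inside large reductions (matrix multiplies, attention) whenever the hardware reorders the partial sums.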
| |
| ▲ | jampekka 3 days ago | parent | prev | next [-] | | I wouldn't be surprised if autoregressive LLMs had some chaotic attractors if you stretch the concept to finite discrete state (tokens). | |
| ▲ | Incipient 3 days ago | parent | prev [-] | | Can you share some examples of eliminating non-determinism? I feel like I should be able to integrate agents into various business systems, but this issue is a blocker. E.g. an auto email parser that extracts an "action" - I just don't trust that the action will be accurate and precise enough to execute without rereading the email (hence defeating the purpose of the agent). | |
| ▲ | _joel 3 days ago | parent [-] | | I'm not sure it eliminates it, but reducing the temperature and top-k/p? |
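For the email example specifically, one pattern is to constrain the output so tightly that a wrong answer is at least machine-detectable - a hedged sketch (the schema and function names are hypothetical):

    # Hypothetical email-action extractor: greedy decoding plus a closed
    # vocabulary of actions, so anything outside the schema is rejected
    # rather than silently executed.
    import json

    ACTIONS = {"schedule_meeting", "forward_invoice", "no_action"}

    def extract_action(call_model, email_text):
        # call_model stands in for whatever LLM client is used (assumption);
        # temperature=0 requests greedy decoding for repeatability.
        raw = call_model(
            "Reply with JSON of the form {\"action\": X} where X is one of "
            + str(sorted(ACTIONS)) + ". Email:\n" + email_text,
            temperature=0,
        )
        try:
            action = json.loads(raw)["action"]
        except (json.JSONDecodeError, KeyError, TypeError):
            return "no_action"  # fail closed: queue for human review instead
        return action if action in ACTIONS else "no_action"

It doesn't remove the need for review when a well-formed answer is simply wrong, but it does turn "anything could happen" into "one of three things can happen".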
|
|
|
| ▲ | belter 3 days ago | parent | prev | next [-] |
| > Introducing it in large customer pipelines or in data-intensive applications means that even if the AI is right 99%, 99.9% or 99.99% of the time, you will see large spurious error rates in your workflow. You just described how you get your Google account locked... :-) |
|
| ▲ | flir 3 days ago | parent | prev [-] |
| How can the agent hide the error? You log the interaction, you see what happened, no? |
| |
| ▲ | wongarsu 3 days ago | parent [-] | | In coding agents that would be "the test keeps failing and I can't fix it - let's delete the test" or "I can't fix this bug - let's delete the feature". If you measure success by unit-test failures or by the presence of the bug, those behaviors can obscure the fact that the LLM wasn't able to make the intended fix. Of course closer inspection will still reveal what happened, but using proxy measurements to track success is dangerous, especially if the LLM knows about them or if the task description implies improving that metric ("a unit test is failing, fix that"). | |
| ▲ | flir 3 days ago | parent [-] | | Sure, but the discussion here is around "in production"? I'm trying to imagine a scenario and I'm coming up short. | | |
| ▲ | sebastiennight 3 days ago | parent [-] | | In GP's comment, the coding agent is deployed "in production", since you (the developer) and/or your company are paying to use it in your business. | |
| ▲ | flir 3 days ago | parent [-] | | "Introducing it in large customer pipelines or in data-intensive applications" *shrug* To be honest, I don't think I'm going to get an answer. |
|
|
|
|