| ▲ | HeavyStorm 4 hours ago |
| The real issue is expecting an LLM to be deterministic when it's not. |
|
| ▲ | Zambyte 4 hours ago | parent | next [-] |
| Language models are deterministic unless you add random input. Most inference tools add random input (the seed value) because it makes for a more interesting user experience, but that is not a fundamental property of LLMs. I suspect determinism is not the issue you mean to highlight. |
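(A minimal sketch of this point, with numpy and a made-up logit vector standing in for a model's output: greedy decoding maps the same context to the same token every time, and randomness appears only when sampling with an explicitly injected seed.)

```python
import numpy as np

def greedy_next_token(logits):
    # Deterministic: identical logits always yield the identical token id.
    return int(np.argmax(logits))

def sampled_next_token(logits, seed):
    # Stochastic only because we feed in random input (the seed) ourselves.
    rng = np.random.default_rng(seed)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))

logits = np.array([1.0, 3.5, 0.2, 2.9])  # stand-in for a forward pass
assert greedy_next_token(logits) == greedy_next_token(logits)
```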
| |
| ▲ | dTal 4 hours ago | parent | next [-] |
| Sort of. They are deterministic in the same way that flipping a coin is deterministic: predictable in principle, in practice too chaotic. Yes, you get the same predicted token every time for a given context. But why that token and not a different one? Too many factors to reliably abstract. |
| ▲ | orbital-decay an hour ago | parent | next [-] |
| > Yes, you get the same predicted token every time for a given context. But why that token and not a different one? Too many factors to reliably abstract. |
| Fixed input-to-output mapping is determinism. Prompt instability is not determinism by any definition of the word; too many people confuse the two. Also, determinism is a fairly niche property that only matters for reproducibility, and prompt instability/unpredictability is irrelevant in practical usage, for the same reason it is with humans: if the model or the human misunderstands the input, you keep correcting the result until it meets your criteria. You never need to reroll the result, so you never see the stochastic side of LLMs. |
| ▲ | ryandrake an hour ago | parent | prev | next [-] |
| It always feels like I just have to figure out and type the correct magical incantation, and that will finally make LLMs behave deterministically. Like, I have to get the right combination of IMPORTANT, ALWAYS, DON'T DEVIATE, CAREFUL, and THOROUGH, and suddenly this thing will behave like an actual computer program and not a distracted intern. |
| ▲ | WithinReason 3 hours ago | parent | prev [-] |
| Like the brain. |
| |
| ▲ | usernametaken29 4 hours ago | parent | prev [-] |
| Actually, at the hardware level, floating point operations are not associative. So even with a temperature of 0 you're not mathematically guaranteed the same response. Hence, not deterministic. |
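(The non-associativity being claimed here is easy to check; a minimal, illustrative Python snippet, not tied to any particular inference stack:)

```python
# IEEE 754 doubles: addition is not associative, so the order in which
# a GPU kernel reduces partial sums can change the final bits.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c    # 0.6000000000000001
right = a + (b + c)   # 0.6

# Same values, different grouping, different result.
assert left != right
print(left, right)
```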
| ▲ | adrian_b 3 hours ago | parent [-] |
| You are right that, as commonly implemented, the evaluation of an LLM may be non-deterministic even when explicit randomization is eliminated, due to various race conditions in a concurrent evaluation. |
| However, if you evaluate the LLM core function carefully, i.e. in a fixed order, you will obtain perfectly deterministic results (except on some consumer GPUs where, due to memory overclocking, memory errors are frequent, which causes slightly erroneous results with non-deterministic errors). |
| So if you want deterministic LLM results, you must audit the programs that you are using to eliminate the causes of non-determinism, and you must use good hardware. This may require some work, but it can be done, similarly to the work required to build a software package deterministically instead of obtaining different executable files at each recompilation from the same sources. |
| ▲ | pixl97 an hour ago | parent | next [-] |
| If you want a deterministic LLM, just build 'Plain old software'. |
| ▲ | KeplerBoy 3 hours ago | parent | prev | next [-] |
| It's not even hard, just slow. You could do it on a single cheap server (compared to a rack full of GPUs): run a CPU LLM inference engine and limit it to a single thread. |
| ▲ | usernametaken29 3 hours ago | parent | prev [-] |
| Only that one is built to be deterministic and one is built to be probabilistic. Sure, you can technically force determinism, but it is going to be very hard. Even just making sure your GPU is indeed doing what it should be doing is hard -- much like debugging a CPU. But again, one is built for determinism and one is built for concurrency. |
| ▲ | wat10000 2 hours ago | parent [-] |
| GPUs are deterministic. It's not that hard to ensure determinism when running the exact same program every time. Floating point isn't magic: execute the same sequence of instructions on the same values and you'll get the same output. The issue is that you're typically not executing the same sequence of instructions every time, because it's more efficient to run different sequences depending on load. This is a good overview of why LLMs are nondeterministic in practice: https://thinkingmachines.ai/blog/defeating-nondeterminism-in... |
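(Both halves of this point can be demonstrated in a few lines; a sketch with stdlib Python only, using a recursive pairwise sum as a stand-in for the different reduction orders a batched GPU kernel might use:)

```python
import random

# Fixed inputs, fixed order: the result is bit-identical on every run.
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(10000)]

sequential = sum(xs)          # one fixed left-to-right reduction order
assert sequential == sum(xs)  # re-running the same order matches exactly

def pairwise_sum(v):
    # A different reduction order, like a parallel tree reduction might use.
    if len(v) == 1:
        return v[0]
    mid = len(v) // 2
    return pairwise_sum(v[:mid]) + pairwise_sum(v[mid:])

# Both orders are individually deterministic, but they can disagree
# in the last few bits -- which is all prompt-scale chaos needs.
print(sequential)
print(pairwise_sum(xs))
```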
|
| ▲ | WithinReason 4 hours ago | parent | prev | next [-] |
| Oh how I wish people understood the word "deterministic" |
|
| ▲ | curt15 3 hours ago | parent | prev | next [-] |
| LLMs are deterministic in the sense that a fixed linear regression model is deterministic. Like linear regression, however, they encode a statistical model of whatever they're trying to describe -- natural language, in the case of LLMs. |
|
| ▲ | timcobb 4 hours ago | parent | prev | next [-] |
| They are deterministic: open a dev console and run the same prompt two times w/ temperature = 0. |
| |
| ▲ | pixl97 an hour ago | parent | next [-] |
| And then the third time it shows up differently, leaving you puzzled about why that happened. The determinism has a lot of 'terms and conditions' that apply depending on how it's executing on the underlying hardware. |
| ▲ | datsci_est_2015 3 hours ago | parent | prev [-] |
| So why don't we all use LLMs with temperature 0? If we separate models (incl. parameters) into two classes, c1: temp=0 and c2: temp>0, why is c2 so widely used vs c1? The nondeterminism must be viewed as a feature more than an anti-feature, making your point about temperature irrelevant (and pedantic) in practice. |
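(For readers unfamiliar with what "temperature" does mechanically: it divides the logits before softmax, so low temperature sharpens the distribution toward the argmax and high temperature flattens it, giving varied completions. A minimal sketch, assuming numpy and hypothetical next-token logits:)

```python
import numpy as np

def token_probs(logits, temperature):
    # Scale logits by 1/temperature, then softmax (numerically stable).
    # Lower temperature concentrates probability mass on the argmax,
    # so temp -> 0 behaves like greedy decoding.
    z = np.asarray(logits, dtype=float) / temperature
    e = np.exp(z - z.max())
    return e / e.sum()

logits = [2.0, 1.0, 0.5]          # hypothetical next-token logits
print(token_probs(logits, 1.0))   # spread out: sampling gives variety
print(token_probs(logits, 0.01))  # ~[1, 0, 0]: effectively deterministic
```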
|
| ▲ | baq 4 hours ago | parent | prev [-] |
| LLMs are essentially pure functions. |