Is it possible to have an LLM always give a consistent output?
2 points by twosdai 13 hours ago | 7 comments

Is there any LLM whose focus is this? E.g.:

1. Given the exact same tokens,
2. always return the exact same tokens.

E.g., I give the LLM the following tokens:

"What color is the sky?"

it would always return a message like:

"It's Purple!"

and never change. I understand this is not a normal use case, nor in many cases desirable for an LLM, but I am curious whether anyone is working on this in an academic paper, or whether there is a private organization doing something along these lines.

yorwba 12 hours ago | parent | next

Deterministic output is a property of the inference framework, not the model. E.g.

https://docs.sglang.ai/advanced_features/deterministic_infer...

https://github.com/ggml-org/llama.cpp/pull/16016
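
For a concrete picture of the framework side, here's a sketch with Hugging Face transformers (not one of the frameworks linked above): greedy decoding removes the sampler's randomness, though bitwise-identical output across runs still depends on deterministic kernels and an identical hardware/software stack.

    # Sketch: greedy decoding (do_sample=False) in HF transformers.
    # This removes sampler randomness; run-to-run bitwise equality
    # still depends on deterministic kernels and identical batching.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    inputs = tok("What color is the sky?", return_tensors="pt")
    out = model.generate(**inputs, do_sample=False, max_new_tokens=16)
    print(tok.decode(out[0], skip_special_tokens=True))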

twosdai 12 hours ago | parent

Really appreciate this; it's exactly what I was looking for.

vunderba 12 hours ago | parent | prev | next

Do you mean:

Given a user input X/Y/Z that "resolves" to the target information of "sky color", respond with "Purple"?

That sounds to me more like sticking a classifier and/or vector-similarity database interceptor in front of an LLM and pre-empting with a cached response.
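
Roughly like this, as a sketch; the encoder model, the 0.9 threshold, and call_llm() are placeholder choices, not a real API:

    # Sketch: vector-similarity cache in front of an LLM.
    # call_llm() and SIM_THRESHOLD are illustrative placeholders.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    cache = []  # list of (unit-norm embedding, cached response)
    SIM_THRESHOLD = 0.9

    def call_llm(prompt: str) -> str:
        ...  # stand-in for your real inference call (placeholder)

    def answer(prompt: str) -> str:
        q = encoder.encode(prompt)
        q = q / np.linalg.norm(q)
        for emb, response in cache:
            if float(np.dot(q, emb)) >= SIM_THRESHOLD:
                return response  # pre-empt with the cached response
        response = call_llm(prompt)
        cache.append((q, response))
        return response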

Otherwise I'm not sure I understand the question. If you just want EXACT TOKEN INPUT => EXACT TOKEN OUTPUT, then it's just a KVP lookup, as @danenania mentioned.

twosdai 13 hours ago | parent | prev | next

To clarify, I value consistency in the output down to the token level, not just the informational content.

compressedgas 12 hours ago | parent | prev | next

Use a fixed sampler seed.
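
E.g., with Hugging Face transformers (a sketch; a fixed seed makes the sampling repeatable, but it doesn't remove floating-point non-determinism across different hardware or batchings):

    # Sketch: fixed seed so sampled decoding is repeatable run-to-run.
    from transformers import AutoModelForCausalLM, AutoTokenizer, set_seed

    set_seed(42)  # seeds Python, NumPy, and PyTorch RNGs
    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    inputs = tok("What color is the sky?", return_tensors="pt")
    out = model.generate(**inputs, do_sample=True, max_new_tokens=16)
    print(tok.decode(out[0], skip_special_tokens=True))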

danenania 12 hours ago | parent | prev | next

If you need this, could you just put your own kv cache in front?
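
Something like this, as a minimal sketch; call_llm() is a placeholder for whatever inference call you actually use:

    # Sketch: exact-match cache so an identical prompt always
    # returns the identical earlier response.
    cache: dict[str, str] = {}

    def call_llm(prompt: str) -> str:
        ...  # stand-in for your real inference call (placeholder)

    def cached_llm(prompt: str) -> str:
        if prompt not in cache:
            cache[prompt] = call_llm(prompt)
        return cache[prompt]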

cratermoon 12 hours ago | parent | prev

Maybe, if you set the temperature to 0, but the sampling math is stochastic by nature, not deterministic.
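
To illustrate with a generic sketch (not any particular framework's code): as the temperature goes to 0, sampling collapses to an argmax over the logits, which is deterministic given identical logits; any remaining run-to-run variation comes from floating-point and kernel effects, not the sampler.

    # Sketch: temperature-scaled softmax sampling; temperature == 0
    # degenerates to a deterministic argmax over the logits.
    import numpy as np

    rng = np.random.default_rng(0)

    def sample_token(logits: np.ndarray, temperature: float) -> int:
        if temperature == 0.0:
            return int(np.argmax(logits))  # greedy: deterministic
        p = np.exp((logits - np.max(logits)) / temperature)
        p /= p.sum()
        return int(rng.choice(len(logits), p=p))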