Neywiny 5 hours ago

If you're trying to get reliability and determinism out of the LLM, you've already lost

tekne 4 hours ago | parent | next [-]

Wait... why?

Making an unreliable, nondeterministic system give reliable results for a bounded task with well-understood parameters is... like half of engineering, no?

There's a huge difference between "generate this code, here's a vague feature description" and "here's a list of criteria, assign this input to one of these buckets" -- the latter is obviously subject to prompt engineering, hallucination, etc -- but so is a human pipeline!
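A minimal sketch of the bucket-assignment case, assuming the model is just a function from text to text. The bucket names, `assign_bucket`, and the stand-in model are all hypothetical. The idea is only that the surrounding code, not the prompt, enforces that the result is one of the allowed buckets:

```python
from typing import Callable, Optional

# Hypothetical bucket names, chosen only for illustration.
BUCKETS = frozenset({"bug", "feature", "question"})


def assign_bucket(
    text: str,
    model: Callable[[str], str],
    max_attempts: int = 3,
) -> Optional[str]:
    """Return a bucket from BUCKETS, or None if no attempt produced a valid one."""
    for _ in range(max_attempts):
        answer = model(text).strip().lower()
        if answer in BUCKETS:  # deterministic check on a nondeterministic output
            return answer
    return None  # caller decides what to do: escalate to a human, log, or drop


if __name__ == "__main__":
    # Stand-in for a real model call: first reply is malformed, second is valid.
    replies = iter(["I think this is probably a bug?", " Bug "])
    print(assign_bucket("App crashes on login", lambda _: next(replies)))
```

Note the limit of this: the check guarantees the output is a valid bucket, not that it is the right one.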

JCTheDenthog 3 hours ago | parent | next [-]

>the latter is obviously subject to prompt engineering, hallucination, etc -- but so is a human pipeline!

...which is why we write deterministic code to take the human out of the pipeline. One of the early uses of computers was calculating firing tables for artillery, to replace teams of humans that were doing the calculations by hand (and usually with multiple humans performing each calculation to catch errors). If early computers had a 99% chance of hallucinating the wrong answer to an artillery firing table, the response from the governments and militaries that used them would not be to keep using computers to calculate them. It would be to go back to having humans do it with lots of manual verification steps and duplicated work to be sure of the results.

If you're trying to make LLMs (a vague simulacrum of humans) with their inherent and unsolvable[1] hallucination problems replace deterministic systems, people are going to eventually decide to return to the tried and true deterministic systems.

1: https://arxiv.org/abs/2401.11817
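The duplicated-work idea above (several people performing each calculation to catch errors) can be put in rough numbers. This is a sketch under a strong assumption: each run is independently correct with probability `p`. Repeated runs of the same model on the same input are likely correlated, so real gains would probably be smaller. The function name is hypothetical.

```python
from math import comb


def majority_correct(p: float, n: int) -> float:
    """Probability that a strict majority of n independent runs is correct.

    Assumes each run is correct with probability p, independently of the others.
    """
    k_min = n // 2 + 1  # smallest count that forms a strict majority
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )


if __name__ == "__main__":
    for n in (1, 3, 5):
        print(n, round(majority_correct(0.9, n), 4))
```

With `p = 0.9` and three runs this gives about 0.972. Under the same model, voting only helps when `p` is above 0.5; below that, more runs make a correct majority less likely.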

Neywiny 3 hours ago | parent | prev [-]

Because it's not possible. There is nothing you can say to the LLM that will guarantee that something happens. It's not how it works. It will maybe be taken into consideration if you're lucky.

But if you're trying to tell me that every time you list criteria you get them all perfectly matched, you're clearly gifted.

evantbyrne 2 hours ago | parent | prev | next [-]

I would hope that when engineers speak of LLM determinism they just mean it as shorthand for a success probability close to 1 under expected conditions

aleksiy123 4 hours ago | parent | prev | next [-]

There’s a whole range between completely random and completely rule based deterministic.

Somewhere in between that I guess is the varying levels of intelligence more likely able to make the “right” decision for anything you throw at it.

sudosteph 2 hours ago | parent | prev | next [-]

I mean, with reliability there's a spectrum. If the risks that an unreliable outcome brings aren't all that bad, then sometimes it's worth it to chase "my agents made an acceptable PR 70% of the time, can I get it to 90?"

Determinism is a different matter. Scripts and hooks are really the main levers you can pull there, but yeah - a decent script and a cron job will handle certain things much better (and for a fraction of the cost)

pydry 4 hours ago | parent | prev [-]

This is something I think some people are fundamentally not capable of understanding.