redsymbol 2 days ago

There may be something I do not understand about LLMs. But it seems it is more correct to say LLMs are chaotic - in the mathematical sense of sensitive dependence on initial conditions.
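A minimal sketch of "deterministic but chaotic", using the logistic map as a stand-in (it is a textbook chaotic system, not an LLM; the starting value and the size of the perturbation are arbitrary choices for illustration). The same input always reproduces the same trajectory, while a tiny change in the input eventually produces a completely different one.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return the whole trajectory."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(r * orbit[-1] * (1.0 - orbit[-1]))
    return orbit


a = logistic_orbit(0.2, 100)
b = logistic_orbit(0.2, 100)          # identical input, run again
c = logistic_orbit(0.2 + 1e-10, 100)  # input perturbed by 1e-10

# Deterministic: repeating the same input gives exactly the same output.
same_input_same_output = (a == b)

# Chaotic: the perturbed run starts close and ends up far away.
first_step_gap = abs(a[1] - c[1])
max_gap = max(abs(x - y) for x, y in zip(a, c))

print(same_input_same_output, first_step_gap, max_gap)
```

Nothing random is involved anywhere in this sketch; the divergence comes purely from the map amplifying the initial difference.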

The only actual nondeterminism is deliberately injected, e.g. via the temperature parameter. Without that, it is deterministic but chaotic. This is the case both in training LLMs and in using the trained models.
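To make the temperature point concrete, here is a toy sketch of picking the next token from a list of scores. The function name `sample_token`, the `LOGITS` values, and the convention that temperature 0 means greedy argmax are all assumptions for illustration, not any particular library's API.

```python
import math
import random

LOGITS = [2.0, 1.5, 0.3]  # made-up scores for three candidate tokens


def sample_token(logits, temperature, rng):
    """Return the index of the chosen token."""
    if temperature == 0:
        # Greedy decoding: no randomness is consulted at all.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax over temperature-scaled scores, shifted for numerical stability.
    scaled = [score / temperature for score in logits]
    top = max(scaled)
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(temperature, seed, n=10):
    """Draw n tokens using a random generator seeded with `seed`."""
    rng = random.Random(seed)
    return [sample_token(LOGITS, temperature, rng) for _ in range(n)]


print(generate(0, seed=1))     # always the top-scoring token
print(generate(1.0, seed=42))  # depends on the injected randomness
```

With temperature 0 the seed is irrelevant, and with a nonzero temperature the output is still reproducible as long as the seed is fixed, which is the sense in which the randomness is injected rather than inherent.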

If I missed something, someone point it out please.

trod1234 2 days ago | parent [-]

You aren't understanding the properties of Determinism; many people, even graduates of a Computer Science program, don't have a working knowledge of this (the most competent do).

It's more correct to say that determinism occurs because the mathematical property is preserved or closed under its domain and the related operations. This connection becomes clear once you've taken an abstract algebra (modern algebra) course. It was a critical leap towards computers, based in the design of emergent systems.

The property can be broken quite easily by not preserving it, but then you have no way to tell the reliability of the output from randomness thereafter, and there is no concept of correctness in stochastic environments (where one token can be more than one token, and tokens are not 'unique').

To put it plainly, Determinism is mathematical relabeling (i.e. a function test on the domain of operations that are performed).

While the constraints hold true, and the ISA and related stack maintain those constraints (i.e. are closed over those operations), you get reliable consistency. The property acts as an abstract guide rail to do work, which is how such simple combinations of circuit logic controlled by software can perform all the magical things we do and see today.

Time Invariance usually goes hand-in-hand with Determinism, and is needed for troubleshooting, and that usually requires memory-less properties, though it depends on where you are on the stack. Determinism is required for any automatic layer for reliability, and that is over the entire domain of possible things that can happen. Without Determinism, you run into halting and incompleteness problems in classical Computer Science, which have stood a good long test of time.

Error handling also generally stops working because you need to know and specify a state to match in order to handle a state, and that requires a determinable state in the first place.

A mapping of one unique input to one unique output, and projection onto are required for relabeling. The electronics are designed to preserve the property up to the logic layer.
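One way to read the "function test" and the one-to-one and onto requirements is as three checks on a set of (input, output) pairs. This is only a sketch of that reading; the helper names below are hypothetical.

```python
def is_function(pairs):
    """True if no input is paired with two different outputs."""
    seen = {}
    for x, y in pairs:
        if x in seen and seen[x] != y:
            return False
        seen[x] = y
    return True


def is_one_to_one(pairs):
    """True if it is a function and no two inputs share an output."""
    if not is_function(pairs):
        return False
    mapping = dict(pairs)
    return len(set(mapping.values())) == len(mapping)


def is_onto(pairs, codomain):
    """True if every element of the codomain is hit by some input."""
    return {y for _, y in pairs} == set(codomain)


def is_relabeling(pairs, codomain):
    """A relabeling here means one-to-one and onto."""
    return is_one_to_one(pairs) and is_onto(pairs, codomain)


print(is_relabeling([("a", 1), ("b", 2)], {1, 2}))  # a clean relabeling
print(is_function([("a", 1), ("a", 2)]))            # "a" is not unique
```

The second call is the failure case described above: an item that is supposed to be unique maps to two things, so nothing downstream can rely on it.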

The moment you have a unique item which is not actually unique, this is broken, and it's really subtle. ldd on Linux, for example, has two different but similar errors of this type that remain unfixed (for over 10 years) because they weren't viewed as errors by the maintainers. This is to say that even long-term professional programmers (likely non-engineers) often fail to recognize these types of foundational errors.

The result is that the utility's output cannot usefully be passed to any further automation, because the output is non-deterministically structured. Specifically, the culprits are the null token and the in-memory kernel structure tokens. Regex also requires these properties. You'll find there is at least one easily found instance of ldd on the ssh utility where you can't simply grep -ev to separate or filter material (to try and pigeonhole the output into a deterministic state), and even adding a DFA program sequentially can't reverse this; a patch must occur at the point of error.

These crop up in production automation all the time, and usually are the most costly to fix given the expertise needed to recognize the error. If determinism isn't present, no automation further downstream can be guaranteed to work. Determinism lets you constrain or expand the scope of a system in systems to narrow and home in on where the failure occurs.

Troubleshooting is an abstract application of testing for determinism, and you can easily tell when a problem won't have this tool available by probing inputs and outputs. In the absence of this property, you only have guess-and-check, which requires intimate knowledge of the entirety of the system at all levels of abstraction. This is most costly in time, given such documentation is almost never available.

As a final real-world example, consider an Excel roster of employees at a large company, where you are only given the name of a person to shut down their account. What do you do when more than one person has the same name? What can you do without further input? Nothing. If you shut down both accounts, you're fired; if you shut down the wrong account, you're fired. You have an indeterminable state.
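The roster case can be sketched in a few lines. The roster rows, account IDs, and the `resolve_account` helper are invented for illustration; the point is only that an automated step has to refuse to act when the key it was given does not pick out exactly one record.

```python
class AmbiguousOrMissing(LookupError):
    """Raised when a name does not identify exactly one account."""


def resolve_account(roster, name):
    """Return the single account ID for `name`, or refuse to guess."""
    matches = [row["account_id"] for row in roster if row["name"] == name]
    if len(matches) != 1:
        raise AmbiguousOrMissing(
            f"{name!r} matched {len(matches)} accounts; a unique key is needed"
        )
    return matches[0]


# Hypothetical roster: two different people share one name.
ROSTER = [
    {"account_id": "u1001", "name": "Pat Lee"},
    {"account_id": "u1002", "name": "Pat Lee"},
    {"account_id": "u1003", "name": "Sam Cho"},
]

print(resolve_account(ROSTER, "Sam Cho"))  # unique, safe to act on

try:
    resolve_account(ROSTER, "Pat Lee")
except AmbiguousOrMissing as err:
    print("cannot proceed:", err)          # needs more input from a person
```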

The interactive layer is a lot more forgiving than the automatic layer because people can recognize when we need to get or provide more information.

Hopefully this clarifies your understanding.

kbelder 2 days ago | parent [-]

I don't see anything you said that indicates the OP was incorrect in any way.

trod1234 2 days ago | parent [-]

If that is the case, then you didn't read or comprehend what was actually said, and no one can tailor a response to people who can't read and comprehend.

There are important distinctions; it's beyond the scope for me to try and guess at where that failure of comprehension might be for an individual such as yourself.

Basic reading comprehension would note: Properties are not individual inputs, they apply to the whole system as a relationship between input and output, individual inputs cannot define properties.

"Chaos" has a very rigorous definition (changes in small inputs lead to large changes in outputs).

"Injection of non-determinism" is only correct if it included a reference that determinism is built-in to all computation which is not a common understanding. Without that reference, the context improperly includes an indeterminable indirection resulting in fallacy.

The two are unrelated and independent to the context of the conversation or determinism, and so defining such understanding in those terms would result in fallacy (by improper isolation), delusion, or hallucination.

These are fundamental errors in reasoning and by extension understanding.

The correct understanding, on firm foundations, was provided. It is on the individual without knowledge to come into a conversation with the bare minimum requirements for comprehension based in rational thought and practice.

Edit: No amount of down-voting will change the truth of this, though I understand why someone would want useful knowledge to be hidden.

tacker2000 2 days ago | parent [-]

I downvoted you because your tone is unnecessarily harsh and rude.

trod1234 2 days ago | parent [-]

You should honestly re-evaluate and re-calibrate your measure of tone in moderation and relation to everything else.

Terse is not harsh or rude, it's condensed, which carries a fine distinction.

Most business people and professionals speak this way; especially when it comes down to the objective facts which are not in question.

The facts and the effort towards minimization of cost for all parties in a communication convey an overall respect; it's extra effort I didn't have to provide, which works towards a specific goal for the benefit of everyone involved in the communication.

If there is a mistake made on either party's part, it's not harsh or rude to point out the mistake in such an unambiguous format, or, where that's not possible due to a deficit, to point out why generally (such as a dependency not met).

Elaborating in great detail, repeatedly or otherwise, would be condescending; on the opposite side, personal haranguing would be a coercive imposition of cost. Lying by omission or commission would be the worst.

You'll note I did none of these things, which is the socially acceptable way to handle it, and does not merit the actions that were taken. I pointed out the errors in comprehension, in the most minimal unambiguous way possible.

The only generally understood acceptable middle ground between those two extremes is terse and to the point, and when you eliminate both sides and the middle ground, you classify all communication as harsh and rude, which is an absurdity.

People cannot read other people's minds, and the point of communication is to convey meaning in a way the parties involved can use it towards their own ends beneficially if they choose, without unnecessary third-party interference.

The reflected appraisal is beneficial to all people involved.

zbentley 2 days ago | parent [-]

> Terse is not harsh or rude, its condensed, which carries a fine distinction.

Ironically, I suspect that terseness would improve your communication here. The significant repetitiveness and length of your posts are both contributing to others’ confusion and giving you more opportunities to be rude.

I’m commenting as opposed to downvoting and moving on because I do think there’s some interesting substance in what you wrote, but it needs an edit pass—for politeness/assumption of positive regard as well as brevity—before it’s in any way useful communication.

trod1234 a day ago | parent [-]

> but it needs an edit pass - for politeness/assumption of positive regard.

We will have to disagree.

I appreciate that, though the post isn't repetitive; each point has a fine and different nuance it's meant to convey, and a single tie-in back to the overarching point/theme at the end signaling the completion of the idea. This is a very common writing style/structure when conveying information.

The problem with doing as you mention is that any further reduction would introduce ambiguity and fragmentary thought through improper generalization/isolation. These would then be latched onto for subtle harassment attacks, like nullification, which have become all too popular on all platforms today.

You'll notice the people I responded to don't provide reflected appraisal, as someone earnest would, and that is needed to write or tailor responses towards a common audience level, or bridge the comprehension gap. This is a failing on their part, not mine.

They are in all likelihood either bots seeking to remove useful information, or are doing so purposefully following a critical theory mindset, which is a particularly destructive engrammed set of trauma loops, though a rare few manage to crawl and pull themselves out of it towards actual reasoning when the mistakes in reasoning are brought to light.

Politeness and positive regard aren't what you seem to think. Washington wrote about politeness, and aside from the dated material, it still holds up today.

You will find that much of what constitutes politeness in his 110 Rules of Civility is present in my writings.

The 'disarmed politeness' you seem to want is based in an impossibility when you strip all the indirection and contradiction away. The brevity you seem to want ignores the fine nuance/comprehensiveness needed to be polite, and the resulting outcome of doing either naturally leads to "lies to children", and the imposition of harassment for volunteering useful information, something I won't do disarmed. An impasse.

Charity is provided and given solely on the terms of the giver, and not at the barrel of a gun, blackmail, or inherent threat thereof; volunteers stop giving when it costs them more than they were willing to give, and it's an individual decision.

I'll have to remember Matthew 7:6 when next I consider providing such charity, though I'm glad and appreciate you letting me know you found the substance useful. So few people do this and it is appreciated.