emodendroket 3 hours ago
LLMs are not deterministic at all. The same input leads to different outputs at random. But I think there's still the question of whether this process is more similar to thought or to a Markov chain.
hackinthebochs 2 hours ago
They are deterministic in the sense that the inference process scores every word in the vocabulary in a deterministic manner. This score map is then sampled from according to the temperature setting. Non-determinism is artificially injected for ergonomic reasons.

>But I think there's still the question of whether this process is more similar to thought or to a Markov chain.

It's definitely far from a Markov chain. Markov chains treat the past context as a single unit: an N-tuple with no internal structure. The next state is indexed by this tuple. LLMs leverage the internal structure of the context, which allows a large class of generalizations that Markov chains necessarily miss.
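A minimal sketch of that first point, assuming a NumPy-style setup (the function name and toy logits are illustrative, not any real library's API): the score map over the vocabulary is a fixed function of the context, and randomness only enters at the sampling step, scaled by temperature.

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=None):
    """Turn a deterministic score map over the vocabulary into a token choice.

    `logits` holds the model's score for every token; it is a pure function
    of the input context. Randomness enters only here, at sampling time.
    """
    rng = rng or np.random.default_rng()
    if temperature == 0.0:
        # Greedy decoding: no randomness, same context -> same token every time.
        return int(np.argmax(logits))
    # Temperature rescales the scores before softmax; higher temperature
    # flattens the distribution, lower temperature sharpens it.
    scaled = logits / temperature
    probs = np.exp(scaled - np.max(scaled))
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))

# The score map is fixed; only the sampling step varies between runs.
logits = np.array([2.0, 1.0, 0.5, -1.0])
print(sample_next_token(logits, temperature=0.0))  # always token 0
print(sample_next_token(logits, temperature=1.0))  # varies run to run
```

With temperature set to 0 (greedy decoding) the whole pipeline is deterministic end to end, which is the sense in which the model itself is deterministic.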