_Algernon_ 7 days ago
LLMs aren't Markov chains unless they have a context window of 1.

> In probability theory and statistics, a Markov chain or Markov process is a stochastic process describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event
mr_wiglaf 7 days ago | parent
The tricky thing is that you get to define the state. If the "state" is the current word _and_ the previous 10, the process is still "memoryless". So an LLM's context window is the state. It doesn't matter whether _we_ think of part of that state as history; the Markov chain doesn't care (they are all just different features of the state). Edit: I could be missing important nuance that other people are pointing out in this thread!
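A minimal sketch of that point, using a toy word-level model with a hypothetical window size K (not anything from the thread): an order-K chain becomes a plain first-order Markov chain once the state is defined as the tuple of the last K words, so each step depends only on the current state.

```python
from collections import defaultdict
import random

K = 3  # hypothetical window size; an LLM's context window plays the same role


def train(words, k=K):
    """Build a transition table keyed by the tuple of the last k words."""
    transitions = defaultdict(list)
    for i in range(len(words) - k):
        state = tuple(words[i:i + k])          # state = last k words
        transitions[state].append(words[i + k])
    return transitions


def generate(transitions, state, steps=20):
    """Sample forward; each step looks only at the current tuple state."""
    out = list(state)
    for _ in range(steps):
        nxt = random.choice(transitions.get(tuple(state), ["<eos>"]))
        if nxt == "<eos>":
            break
        out.append(nxt)
        state = list(state)[1:] + [nxt]        # slide the window: the "memory" lives inside the state
    return " ".join(out)


corpus = "the cat sat on the mat and the cat sat on the hat".split()
table = train(corpus)
print(generate(table, corpus[:K]))
```

The design choice is the same one the comment describes: nothing outside the current tuple is consulted when sampling the next word, so the process is Markovian even though the tuple itself encodes what we would informally call "history".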