Sohcahtoa82 | 20 hours ago
> The chain has never seen anything. The Markov chain is a table of probability distributions. You can create it by any means you see fit. There is no such thing as a "series" of tokens that has been seen by the chain.

When I talk about the chain "seeing" a sequence, I mean that the sequence existed in the material that was used to generate the probability table. My instinct is to believe that you know this, but are being needlessly pedantic.

My point is that if you're using a context length of two and you prompt a Markov chain with "my cat", but the sequence "my cat was" never appeared in the training material, then the Markov chain will never choose "was" as the next word. This property is not true for LLMs. If you prompt an LLM with "my cat", then "was" has a non-zero chance of being chosen as the next word, even if "my cat was" never appeared in the training material.
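A minimal sketch of that property, assuming a word-level chain with context length two (the corpus and function name here are made up for illustration):

```python
from collections import defaultdict, Counter

def build_chain(text, context_len=2):
    """Build a (context tuple) -> {next word: probability} table from raw text."""
    words = text.split()
    counts = defaultdict(Counter)
    for i in range(len(words) - context_len):
        context = tuple(words[i:i + context_len])
        counts[context][words[i + context_len]] += 1
    # Normalize counts into probabilities; only observed continuations get entries.
    return {ctx: {w: c / sum(nxt.values()) for w, c in nxt.items()}
            for ctx, nxt in counts.items()}

# Hypothetical training material in which "my cat was" never occurs.
corpus = "my cat sat on the mat and my cat ate the food"
chain = build_chain(corpus)

print(chain[("my", "cat")])                   # {'sat': 0.5, 'ate': 0.5}
print(chain[("my", "cat")].get("was", 0.0))   # 0.0 -- unseen continuation, zero probability
```

The table simply has no entry for an unseen continuation, so its probability is exactly zero, whereas an LLM's softmax assigns some probability mass to every token in the vocabulary.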