chpatrick 10 hours ago
I think you're confusing Markov chains and "Markov chain text generators". A Markov chain is a mathematical structure where the probabilities of going to the next state depend only on the current state and not on the previous path taken. That's it. It doesn't say anything about whether the probabilities are computed by a transformer or stored in a lookup table; the chain just exists. How the probabilities are determined in a program doesn't matter mathematically.
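To make the representation-independence point concrete, here is a minimal sketch (names and probabilities are made up for illustration): the same two-state chain defined once as a lookup table and once as a computed function. The Markov property is about what the next-state distribution depends on, not how it is stored.

```python
import random

# Two ways to define the SAME Markov chain over states {"A", "B"}.

# 1. Transition probabilities stored in a lookup table.
TABLE = {
    "A": {"A": 0.9, "B": 0.1},
    "B": {"A": 0.5, "B": 0.5},
}

def next_state_table(state):
    dist = TABLE[state]
    return random.choices(list(dist), weights=list(dist.values()))[0]

# 2. The same probabilities computed on the fly by a function
#    of the current state (stand-in for "computed by a model").
def transition_probs(state):
    return {"A": 0.9, "B": 0.1} if state == "A" else {"A": 0.5, "B": 0.5}

def next_state_computed(state):
    dist = transition_probs(state)
    return random.choices(list(dist), weights=list(dist.values()))[0]

# Both define identical chains: in each case the distribution over the
# next state is a function of the current state alone.
```

Whether the numbers come from a table, a closed-form function, or a neural network is invisible to the mathematics; only the dependence structure matters.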
saithound 7 hours ago
Just a heads-up: this is not the first time somebody has had to explain Markov chains to famouswaffles on HN, and I'm pretty sure it won't be the last. Engaging further might not be worth it.
famouswaffles 5 hours ago
'A Markov chain is a mathematical structure where the probabilities of going to the next state only depend on the current state and not the previous path taken.'

My point, which seems so hard to grasp for whatever reason, is that in a Markov chain, "state" is a well-defined thing. It's not a variable you can assign any property to. LLMs do depend on the previous path taken; that's the entire reason they're so useful! And the only reason you can say they don't is that you've redefined 'state' to include that previous path. It's nonsense. Can you not see the circular argument?

The state is required to be a fixed, well-defined element of a structured state space. Redefining it as an arbitrarily large, continuously valued encoding of the entire history trivializes the Markov property that a Markov chain is supposed to satisfy. Under that definition, any sequential system can be called Markov, which means the term no longer distinguishes anything.
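The trivialization being argued here can be sketched in a few lines (the wrapper and function names are hypothetical, purely for illustration): any history-dependent process becomes "Markov" once you declare the entire history to be the state.

```python
# Any sequential process becomes "Markov" if the state is redefined
# to be the full history. The names below are illustrative only.

def make_markov(step):
    """Wrap an arbitrary history-dependent update `step(history)` as a
    'Markov' transition whose state is the entire history tuple."""
    def transition(state):           # state = the whole history so far
        return state + (step(state),)
    return transition

# A blatantly history-dependent process over its outputs:
# the next value is the sum of ALL previous values.
def sum_of_history(history):
    return sum(history)

transition = make_markov(sum_of_history)

state = (1,)
for _ in range(4):
    state = transition(state)
# state is now (1, 1, 2, 4, 8): each new value depended on the full
# past, yet the wrapped system satisfies the "Markov" property in the
# trivialized sense, because the past was folded into the state.
```

This is the shape of the disagreement in the thread: one side treats "state = full context" as a legitimate (if large) state space, the other argues that allowing it drains the term of its discriminating power.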