lukev 2 days ago
Other comments in this thread do a good job explaining the differences between the Markov algorithm and the transformer algorithm that LLMs use. But it's worth mentioning that you have indeed identified a real similarity: both LLMs and Markov chain generators share the same algorithmic structure, autoregressive next-token generation.

Understanding Markov chain generators is actually a really good step towards understanding how LLMs work, and I think it's a great pedagogical tool. Once you understand Markov generation, a bit of handwaving to say "LLMs are just like this, except with a far more sophisticated statistical model for picking the next token" has the benefit of being true, demystifying LLMs, and preserving a healthy respect for just how powerful that statistical model can be. A minimal sketch of that shared loop, assuming a toy order-1 (bigram) Markov model and illustrative names like build_bigram_model and generate:

    import random
    from collections import defaultdict

    def build_bigram_model(tokens):
        """Count next-token frequencies for each token (an order-1 Markov chain)."""
        counts = defaultdict(lambda: defaultdict(int))
        for current, nxt in zip(tokens, tokens[1:]):
            counts[current][nxt] += 1
        return counts

    def generate(model, start, length=20):
        """Autoregressive loop: sample a next token from the model, append, repeat."""
        output = [start]
        for _ in range(length):
            nexts = model.get(output[-1])
            if not nexts:
                break  # no observed continuation for this token
            tokens, weights = zip(*nexts.items())
            output.append(random.choices(tokens, weights=weights)[0])
        return " ".join(output)

    corpus = "the cat sat on the mat and the cat ran".split()
    model = build_bigram_model(corpus)
    print(generate(model, "the"))

The outer loop is the whole point: an LLM keeps exactly this structure, but replaces the count-table lookup with a transformer forward pass (over the whole context, not just the last token) to produce the next-token distribution.