Charon77 7 hours ago

The issue with Markov chains is that you can't get good next-token prediction over a long enough context: once you condition on the last 1000 words instead of just 2, it's quite unlikely that your frequency table has an entry for that exact combination. And Markov chains don't work on token embeddings that allow some encoding of meaning.
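To make the sparsity point concrete, here is a rough sketch (not a real language model, just a counter). The toy corpus and the helper names `context_counts` and `repeated_fraction` are made up for illustration. It measures what fraction of contexts of a given length occur more than once, which is the only case where a frequency table has anything useful to say.

```python
from collections import Counter


def context_counts(tokens, order):
    """Count how often each length-`order` context occurs in `tokens`."""
    return Counter(
        tuple(tokens[i:i + order]) for i in range(len(tokens) - order + 1)
    )


def repeated_fraction(tokens, order):
    """Fraction of context positions whose context occurs more than once."""
    counts = context_counts(tokens, order)
    total = sum(counts.values())
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / total if total else 0.0


# Hypothetical toy corpus; any real text shows the same trend.
corpus = (
    "the cat sat on the mat and the dog sat on the rug "
    "while the cat saw the dog on the mat"
).split()

for order in (1, 2, 4, 8):
    print(order, round(repeated_fraction(corpus, order), 2))
```

On this corpus the fraction drops as the context gets longer, and by 8 tokens no context repeats at all. A bigger corpus pushes that cliff further out, but the number of possible contexts grows exponentially with length, so it never gets anywhere near 1000 words.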

AlecSchueler an hour ago

> And Markov chains don't work on token embeddings that allow some encoding of meaning.

Working on an "encoding of meaning" sure sounds a lot like reasoning.