inciampati 2 days ago
Markov chains have an exponential falloff in correlations between tokens over time. That's dramatically different from real text, which contains extremely long-range correlations. They simply can't model long-range correlations. As such, they can't be guided. They can memorize, but not generalize.
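To make the exponential falloff concrete, here is a minimal sketch using a two-state chain. The function names and the transition probabilities `a` and `b` are illustrative assumptions, not something from the thread. For such a chain the lag-k autocorrelation of the state works out to (1 - a - b)^k, so it shrinks geometrically with distance:

```python
def mat_mul(A, B):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def autocorrelation(a, b, max_lag):
    """Exact lag-k autocorrelation of the state X_t in {0, 1} for a
    stationary two-state Markov chain.

    a = P(0 -> 1) and b = P(1 -> 0) are hypothetical example parameters,
    assumed to satisfy 0 < a, b < 1.
    """
    P = [[1 - a, a], [b, 1 - b]]
    pi1 = a / (a + b)              # stationary probability of state 1
    var = pi1 * (1 - pi1)          # variance of the 0/1 state indicator
    Pk = [[1.0, 0.0], [0.0, 1.0]]  # P^0 is the identity
    corr = []
    for _ in range(max_lag + 1):
        # E[X_0 * X_k] = P(X_0 = 1) * P(X_k = 1 | X_0 = 1) = pi1 * P^k[1][1]
        cov = pi1 * Pk[1][1] - pi1 * pi1
        corr.append(cov / var)
        Pk = mat_mul(Pk, P)
    return corr


if __name__ == "__main__":
    a, b = 0.1, 0.2
    lam = 1 - a - b  # second eigenvalue of the transition matrix
    for k, c in enumerate(autocorrelation(a, b, 10)):
        print(f"lag {k:2d}: corr = {c:.6f}, lambda^k = {lam ** k:.6f}")
```

The two printed columns agree at every lag. A higher-order chain over a larger vocabulary has more states, but in the well-behaved (ergodic) case its correlations are still bounded by a geometric decay set by the second-largest eigenvalue magnitude, which is the falloff described above.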
kittikitti 2 days ago
As someone who has developed chatbots with both HMMs and Transformers, this is a great and succinct answer. The paper "Attention Is All You Need" addressed this drawback.
zwaps 2 days ago
This is the correct answer.