Anon84 2 days ago

Not that different. In fact, you can use Markov Chain theory as an analytical tool to study LLMs: https://arxiv.org/abs/2410.02724

You could probably point your code to Google Books N-grams (https://storage.googleapis.com/books/ngrams/books/datasetsv3...) and get something that sounds (somewhat) reasonable.
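For what it's worth, here is a minimal sketch of what "pointing your code at n-gram counts" might look like. It assumes you have already reduced the data to a plain mapping from an n-gram string to a count; the `toy_counts` dictionary and the `build_chain` / `generate` names are made up for illustration and are not the actual Google Books file format, which would need its own parsing step.

```python
import random
from collections import defaultdict


def build_chain(ngram_counts):
    """Map each context (all words but the last) to weighted next-word options."""
    chain = defaultdict(lambda: ([], []))
    for ngram, count in ngram_counts.items():
        *context, next_word = ngram.split()
        words, weights = chain[tuple(context)]
        words.append(next_word)
        weights.append(count)
    return chain


def generate(chain, seed, max_words=20, rng=None):
    """Walk the chain from `seed`, sampling each next word in proportion to its count."""
    rng = rng or random.Random()
    out = list(seed)
    order = len(seed)  # context length, i.e. n - 1 for n-grams
    while len(out) < max_words:
        context = tuple(out[-order:])
        if context not in chain:  # unseen context: stop rather than guess
            break
        words, weights = chain[context]
        out.append(rng.choices(words, weights=weights, k=1)[0])
    return " ".join(out)


# Hypothetical toy bigram counts, standing in for real n-gram data.
toy_counts = {
    "the cat": 5, "the dog": 3,
    "cat sat": 4, "dog sat": 2,
    "sat on": 6, "on the": 6,
}

chain = build_chain(toy_counts)
text = generate(chain, ("the",), max_words=8, rng=random.Random(0))
print(text)
```

With bigrams the output only stays locally plausible; longer n-grams (a longer `seed` and matching counts) should sound better, at the cost of a much larger table.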

JPLeRouzic a day ago

Thank you, this link (Google Books N-grams) looks very interesting.