currymj | 2 days ago
Bigram and trigram language models (with some smoothing tricks to allow generalization beyond the training set) were state of the art for many years. Ch. 3 of Jurafsky and Martin's textbook (which is up to date and goes all the way to LLMs, embeddings, etc.) is good on this topic: https://web.stanford.edu/~jurafsky/slp3/ed3book_aug25.pdf I don't know the history in detail, but I would guess there have been times (such as the 90s) when the best neural language models were worse than the best trigram models.
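To make the smoothing idea concrete, here's a minimal sketch (in Python, with illustrative names) of a bigram model with add-k smoothing, the simplest of the tricks that chapter covers; it also treats fancier methods like Kneser-Ney, which actually held up in practice:

  from collections import Counter

  def train_bigram_lm(sentences):
      """Count unigrams and bigrams over tokenized sentences."""
      unigrams, bigrams = Counter(), Counter()
      for toks in sentences:
          toks = ["<s>"] + toks + ["</s>"]
          unigrams.update(toks)
          bigrams.update(zip(toks, toks[1:]))
      return unigrams, bigrams

  def bigram_prob(w_prev, w, unigrams, bigrams, k=1.0):
      """Add-k smoothed P(w | w_prev): every bigram count gets +k,
      so unseen pairs still receive nonzero probability mass."""
      vocab_size = len(unigrams)
      return (bigrams[(w_prev, w)] + k) / (unigrams[w_prev] + k * vocab_size)

  # Toy usage: an unseen bigram still gets a (small) nonzero probability.
  corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
  uni, bi = train_bigram_lm(corpus)
  print(bigram_prob("the", "cat", uni, bi))   # seen in training
  print(bigram_prob("the", "bird", uni, bi))  # never seen, but > 0

Without the +k, any sentence containing a bigram absent from the training data would get probability exactly zero, which is why some smoothing scheme is essential for out-of-training-set generalization.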