thfuran 2 days ago
> Markov Chains will ONLY generate sequences that existed in the source.

A Markov chain of order N will only generate sequences of length N+1 that were in the training corpus, but it is likely to generate sequences of length N+2 that weren't (unless N was too large for the training corpus and the chain is degenerate).
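A rough sketch of that, using an order-2 chain over a made-up two-sentence corpus (the corpus and the "novel" 4-grams it reports are purely for illustration):

    from collections import defaultdict

    corpus = "whales are mammals . dogs are mammals that bark .".split()

    # Order-2 table: every key is a 2-word context seen in the corpus, so any
    # generated trigram (context + next word) also occurs in the corpus.
    table = defaultdict(set)
    for a, b, c in zip(corpus, corpus[1:], corpus[2:]):
        table[(a, b)].add(c)

    corpus_4grams = set(zip(corpus, corpus[1:], corpus[2:], corpus[3:]))

    # Chain two overlapping transitions, (a, b) -> c and then (b, c) -> d, and
    # report any resulting 4-gram that never occurs in the corpus itself.
    for (a, b), cs in table.items():
        for c in cs:
            for d in table.get((b, c), ()):
                if (a, b, c, d) not in corpus_4grams:
                    print("novel:", a, b, c, d)

With this toy corpus it prints "whales are mammals that" and "dogs are mammals .": every trigram inside them occurs in the corpus, but the 4-grams themselves never do.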
Sohcahtoa82 2 days ago
Well yeah, but in that N+2 sequence, the choice of the +2 word has already lost the first part of the context. If you use a context window of 2, then yes, you might know that word C can follow words A and B, and that D can follow B and C, and therefore generate ABCD even if ABCD never existed in the corpus. But ABCD could be incoherent.

For example, take A = whales, B = are, C = mammals, D = reptiles. "Whales are mammals" is fine, "are mammals reptiles" is fine, but "Whales are mammals reptiles" is incoherent.

The longer you let the chain run, the more incoherent it gets: "Whales are mammals that are reptiles that are vegetables too". Any 3-word fragment of that sentence is fine, but put it together and it's an incoherent mess.
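Concretely, here's a sketch of that with a window of 2, hard-coding just the two transitions from the example (a toy table, not anything trained on real text):

    # Toy order-2 table containing only the transitions from the example above.
    table = {
        ("whales", "are"): "mammals",    # "whales are mammals" is fine
        ("are", "mammals"): "reptiles",  # "are mammals reptiles" is fine
    }

    out = ["whales", "are"]
    while (out[-2], out[-1]) in table:
        # Only the last two words are visible here, so by the time "reptiles"
        # is chosen, "whales" has already fallen out of the context.
        out.append(table[(out[-2], out[-1])])

    print(" ".join(out))  # -> whales are mammals reptiles

Each individual transition is reasonable on its own; the incoherence only shows up in the full output, which the chain never looks at.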
Isamu 2 days ago
Right, you can generate long sentences from a first-order Markov model, and all of the transitions from one word to the next will be in the training set, but the full generated sentence may not be.
Jensson a day ago
> A Markov chain of order N will only generate sequences of length N+1 that were in the training corpus

Depends on how you trained it; an LLM is also a Markov chain.