▲ 0cf8612b2e1e 2 days ago
Sure, that specific text does not exist, but the discrete tokens that went into it would have. If you similarly trained a Markov chain at the token level on an LLM-sized corpus, it could produce the same output. Lacking an attention mechanism, the token probabilities would be terribly unconstructive for the effort, but it is not impossible.
▲ Sohcahtoa82 2 days ago | parent
Let's assume three things here:

1. The corpus contains every Trump speech.
2. The corpus contains everything ever written about XSS.
3. The corpus does NOT contain Trump talking about XSS, nor really anything that puts "Trump" and "XSS" within the same page.

A Markov Chain could not produce a speech about XSS in the style of Trump.

The greatest tuning factor for a Markov Chain is the context length. A short length (like 2-4 words) produces incoherent results because it only looks at the last 2-4 words when predicting the next word. This means that if you prompted the chain with "Create a speech about cross-site scripting in the style of Donald Trump", then even with a 4-word context, all the model processes is "style of Donald Trump". By the time it reaches the end of the prompt, it has already forgotten the beginning of it.

If you increase the context to 15, then the chain would produce nothing, because "Create a speech about cross-site scripting in the style of Donald Trump" has never appeared in its corpus, so there's no data for what to generate next.

The matching in a Markov Chain is discrete. It's purely a mapping of (series of tokens) -> (list of possible next tokens). If you pass in a series of tokens that was never seen in the training set, then the list of possible next tokens is an empty set.
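A minimal sketch of that discrete mapping in Python, assuming a toy corpus and a fixed 3-token context purely for illustration:

```python
import random
from collections import defaultdict

def build_chain(tokens, context_len=3):
    """Map each context_len-token prefix to the list of tokens seen after it."""
    chain = defaultdict(list)
    for i in range(len(tokens) - context_len):
        prefix = tuple(tokens[i:i + context_len])
        chain[prefix].append(tokens[i + context_len])
    return chain

def generate(chain, prompt_tokens, context_len=3, max_tokens=50):
    """Sample forward; stop as soon as the current prefix was never seen in training."""
    out = list(prompt_tokens)
    for _ in range(max_tokens):
        prefix = tuple(out[-context_len:])
        candidates = chain.get(prefix)
        if not candidates:  # unseen prefix -> empty set of next tokens -> nothing to emit
            break
        out.append(random.choice(candidates))
    return " ".join(out)

# Hypothetical tiny "corpus" standing in for the training data.
corpus = "we will make great speeches about great things believe me".split()
chain = build_chain(corpus, context_len=3)

# A prompt whose last 3 tokens appear in the corpus keeps generating.
print(generate(chain, "we will make".split()))

# A prompt whose last 3 tokens never appeared is echoed back with nothing added.
print(generate(chain, "speech about cross-site scripting".split()))
```

The second call illustrates the empty-set case: the chain has no entry for that prefix, so there is simply no distribution to sample from.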