| ▲ | astrange 2 days ago |
> This is entirely different from saying it can only reproduce samples which it was trained on. It is not a memory machine that is surgically piecing together snippets of memorized samples. (That would be a mind bogglingly impressive machine!) You could create one of those using both a Markov chain and an LLM.
| ▲ | godelski 16 hours ago | parent |
Though I enjoyed that paper, it's not quite the same thing. There's a bit more subtlety to what I'm saying. To do surgical patching you'd need a rich understanding of language while lacking the tools to produce words yourself. Think of the sci-fi style robots that speak by stitching together clips or recordings; Bumblebee from Transformers might be the best-known example. But think hard about that, because it requires a weird set of conditions and a high level of intelligence to perform the search and stitching.

Speaking of Markov, though, we do get that property in LLMs through generation. We don't really have conversations with them: each chat turn is a fresh call where you pass in the entire conversation so far. There's no memory, so the longer the conversation runs, the larger the token count. That's Markovian ;)
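To make that statelessness concrete, here's a minimal sketch (Python; call_llm is a hypothetical stand-in for whatever chat API you use, not any real library): every turn just appends to a list and re-sends the whole thing, so the "state" is the full transcript.

    # Hypothetical stand-in for a chat-completion API call; just echoes a count.
    def call_llm(messages):
        return "(reply to %d messages)" % len(messages)

    history = []  # the only "memory" there is: a list we keep re-sending
    for user_turn in ["hi", "tell me more", "why?"]:
        history.append({"role": "user", "content": user_turn})
        reply = call_llm(history)  # the *entire* conversation goes in every call
        history.append({"role": "assistant", "content": reply})
        print("next call will send", len(history), "messages")

The model never sees anything beyond what's handed to it in that single call, which is the sense in which the process is Markovian.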