maxbond an hour ago
The escape algorithm here is very simple: you remove the special tokens from the runtime tokenizer's vocabulary, so that it's forced to encode them as multiple non-special tokens. (That doesn't actually mean the LLM won't treat them as special tokens, though, so this isn't sufficient on its own.)
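A minimal sketch of that idea with a toy greedy tokenizer (the vocabulary, token names, and class here are illustrative, not any real library's API): while a special token is registered it encodes as one atomic unit, but after removing it the same string falls back to ordinary pieces.

```python
class ToyTokenizer:
    """Toy tokenizer: matches registered special tokens atomically,
    otherwise falls back to single characters."""

    def __init__(self, special_tokens):
        self.special = set(special_tokens)

    def encode(self, text):
        tokens = []
        i = 0
        while i < len(text):
            # Greedily match a registered special token at this position.
            match = next((s for s in self.special if text.startswith(s, i)), None)
            if match:
                tokens.append(match)
                i += len(match)
            else:
                tokens.append(text[i])  # ordinary single-character piece
                i += 1
        return tokens

    def remove_special(self, token):
        # The "escape": unregister the special token so it can only be
        # encoded as a run of plain pieces.
        self.special.discard(token)


tok = ToyTokenizer({"<|im_end|>"})
print(tok.encode("hi<|im_end|>"))  # ['h', 'i', '<|im_end|>'] — one atomic token
tok.remove_special("<|im_end|>")
print(tok.encode("hi<|im_end|>"))  # falls apart into plain characters
```

The point of the parenthetical above: even after this, the decoded text still *looks* like the special token on the way back in, which is why the escape alone isn't a complete defense.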
bashbjorn 3 minutes ago
Cool technique, but I'm not sure I'd call it simple. Doing this means you can't just tokenize the rendered chat template as one big string. You might need to tokenize the segments separately and combine them afterward.
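A sketch of what that segment-wise assembly might look like. Everything here is hypothetical (the token ids, `encode_text`, and the turn layout are made up for illustration): the template code splices in real special-token ids directly, while untrusted content goes through a text-only encoder that cannot produce those ids.

```python
# Made-up ids, chosen outside the range encode_text can produce.
SPECIAL_IDS = {"<|user|>": 100000, "<|end|>": 100001}

def encode_text(text):
    # Stand-in for a tokenizer with the special tokens removed; here
    # each character simply becomes its codepoint.
    return [ord(c) for c in text]

def encode_turn(role_token, content):
    # Trusted template structure contributes the special ids directly;
    # the untrusted content segment is tokenized separately, so a
    # special-token string inside it stays plain text.
    return [SPECIAL_IDS[role_token]] + encode_text(content) + [SPECIAL_IDS["<|end|>"]]

ids = encode_turn("<|user|>", "hi <|user|>")
# The embedded "<|user|>" in the content never becomes id 100000.
```

The extra bookkeeping — tracking which spans are template and which are content, and concatenating token lists instead of strings — is what makes this less simple than it first sounds.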