shawnz 7 hours ago
Another fun application of combining LLMs with arithmetic coding is steganography. Here's a project I worked on a while back which effectively uses the inverse of the technique being used here to construct a steganographic transformation: https://github.com/shawnz/textcoder
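To make the "opposite technique" idea concrete, here is a minimal sketch (not textcoder's actual code). Compression runs an arithmetic encoder over text to get bits; steganography runs the arithmetic decoder over secret bits to get text. Everything below is an assumption for illustration: `MODEL` is a fixed three-token toy distribution standing in for an LLM's context-dependent next-token probabilities, `hide` and `reveal` are hypothetical names, and the receiver is assumed to know the message length `n`.

```python
from fractions import Fraction
from math import ceil

# Toy stand-in for an LLM: fixed token probabilities that sum to 1.
# A real model would return a new distribution after every token.
MODEL = [("the", Fraction(1, 2)), ("cat", Fraction(1, 4)), ("sat", Fraction(1, 4))]


def hide(bits):
    """Arithmetic-DEcode a non-empty bit string into innocuous-looking tokens."""
    n = len(bits)
    x = Fraction(int(bits, 2), 2 ** n)  # secret as a binary fraction in [0, 1)
    low, width = Fraction(0), Fraction(1)
    tokens = []
    # Narrow the interval until it can contain only one n-bit fraction.
    while width > Fraction(1, 2 ** n):
        cum = low
        for tok, p in MODEL:
            if cum <= x < cum + width * p:
                tokens.append(tok)
                low, width = cum, width * p
                break
            cum += width * p
    return tokens


def reveal(tokens, n):
    """Arithmetic-ENcode the tokens to recover the n secret bits."""
    low, width = Fraction(0), Fraction(1)
    for tok in tokens:
        cum = low
        for t, p in MODEL:
            if t == tok:
                low, width = cum, width * p
                break
            cum += width * p
    # The final interval has width <= 2**-n, so it holds exactly one multiple of 2**-n.
    k = ceil(low * 2 ** n)
    return format(k, f"0{n}b")


if __name__ == "__main__":
    cover = hide("1011")
    print(cover, reveal(cover, 4))
```

Exact fractions keep the sketch short; a practical coder would use fixed-precision integer arithmetic and would typically encrypt the payload first so the bits look uniformly random.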
akoboldfrying 4 hours ago
Cool! It creates very plausible encodings.

> The Llama tokenizer used in this project sometimes permits multiple possible tokenizations for a given string.

Not having tokens be a prefix code is thoroughly unfortunate. Do the Llama team consider it a bug? I don't see how to rectify the situation without a full retrain, sadly.
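A small sketch of the ambiguity being discussed, using a made-up vocabulary rather than Llama's real one. `tokenizations` and `is_prefix_free` are hypothetical helpers: the first enumerates every way to split a string into vocabulary tokens, the second checks whether any token is a proper prefix of another. A prefix-free vocabulary admits at most one split per string, which is presumably why its absence matters here: the receiver may re-tokenize the cover text into a different token sequence than the one the sender emitted.

```python
def tokenizations(text, vocab):
    """Return every way to split `text` into tokens drawn from `vocab`."""
    if not text:
        return [[]]
    results = []
    for i in range(1, len(text) + 1):
        head = text[:i]
        if head in vocab:
            for rest in tokenizations(text[i:], vocab):
                results.append([head] + rest)
    return results


def is_prefix_free(vocab):
    """True if no token is a proper prefix of another token."""
    return not any(a != b and b.startswith(a) for a in vocab for b in vocab)


# Toy vocabularies, assumed for illustration only.
AMBIGUOUS = {"a", "b", "c", "ab", "bc", "abc"}
PREFIX_FREE = {"a", "bc", "bd"}

if __name__ == "__main__":
    print(is_prefix_free(AMBIGUOUS), tokenizations("abc", AMBIGUOUS))
    print(is_prefix_free(PREFIX_FREE), tokenizations("abc", PREFIX_FREE))
```

With the ambiguous vocabulary, "abc" has four valid splits; with the prefix-free one it has exactly one.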