rini17 4 days ago
As a general preprocessing step, it might be worth checking for punctuation that repeats at fixed intervals, removing it before compression, and restoring it after decompression.
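A rough sketch of that idea in Python, for illustration only. It assumes the "punctuation" is a newline separator and that every line except possibly the last has the same length, as in a line-wrapped sequence body with any header already split off. The helper names `strip_periodic` and `restore_periodic` are made up for this example, and LZMA is just a stand-in for whatever compressor is used.

```python
import lzma


def strip_periodic(data: bytes, sep: bytes = b"\n"):
    """Remove `sep` if it occurs at a fixed interval.

    Returns (payload, width, trailing). width == 0 means the input was
    not regular and is returned unchanged.
    """
    width = data.find(sep)
    if width <= 0:
        return data, 0, False
    trailing = data.endswith(sep)
    body = data[:-len(sep)] if trailing else data
    lines = body.split(sep)
    # Every line but the last must be exactly `width` bytes long;
    # the last one may be shorter but not empty.
    regular = (all(len(line) == width for line in lines[:-1])
               and 0 < len(lines[-1]) <= width)
    if not regular:
        return data, 0, False
    return b"".join(lines), width, trailing


def restore_periodic(payload: bytes, width: int, trailing: bool,
                     sep: bytes = b"\n") -> bytes:
    """Re-insert `sep` every `width` bytes (inverse of strip_periodic)."""
    if width == 0:
        return payload
    chunks = [payload[i:i + width] for i in range(0, len(payload), width)]
    return sep.join(chunks) + (sep if trailing else b"")


# Round trip: strip, compress, decompress, restore.
original = b"ACGTACGT\nACGTACGT\nACG\n"
payload, width, trailing = strip_periodic(original)
packed = lzma.compress(payload)
restored = restore_periodic(lzma.decompress(packed), width, trailing)
assert restored == original
```

The width and the trailing-separator flag have to be stored next to the compressed data so the layout can be rebuilt exactly. Input that does not fit the pattern is passed through untouched, so the step stays lossless.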
vintermann a day ago
That turns it into specialized compression, which DNA already has plenty of. Many forms of specialized compression even allow string-related queries directly on the compressed data.
bede 3 days ago
Yes, it sounds like 7-Zip/LZMA can do this using custom filters, among other more exotic (and slow) statistical compression approaches.