pajko a day ago

Bzip2 performs better precisely because it rearranges the input to achieve better pattern matches: https://en.m.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_tran...
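To make the "rearranges the input" part concrete, here is a minimal sketch of a textbook Burrows-Wheeler transform built from sorted rotations. This is not how bzip2 implements it (real compressors use much faster sorting); the function name `bwt` and the `$` end marker are just illustrative choices.

```python
def bwt(text: str, end: str = "$") -> str:
    """Naive Burrows-Wheeler transform: sort all rotations, keep the last column.

    Assumes `end` does not occur in `text` and sorts before every other
    character in it. Quadratic memory, so only suitable for tiny inputs.
    """
    if end in text:
        raise ValueError("end marker must not occur in the input")
    s = text + end
    # Every cyclic rotation of the input, in lexicographic order.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The last column groups together characters that precede similar contexts.
    return "".join(rotation[-1] for rotation in rotations)


print(bwt("banana"))  # annb$aa
```

Characters that are followed by the same context end up next to each other in the output, which is what the later move-to-front and entropy coding stages take advantage of.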

vintermann 7 hours ago

A number of identical copies of a string, but with random mutations propagating through it like a word ladder puzzle, is pretty close to best-case for BWT-based compressors.

But Bzip2 is also a pretty bad BWT-based compressor. Not only does it use block sizes from a time when 8 MB of memory was a lot, it also does silly things that don't help compression at all.
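A rough way to see the block size limit, sketched with Python's standard `bz2` module: bzip2's blocks top out at about 900 kB (level 9), so a repeat that sits farther apart than one block cannot be matched. The helper name `doubled_ratio` and the chunk sizes are my own choices for illustration, and random bytes are used only so that the repeat is the sole redundancy in the data.

```python
import bz2
import os


def doubled_ratio(chunk_size: int, level: int = 9) -> float:
    """Compress a random chunk followed by an exact copy of itself.

    Returns compressed size divided by input size. Random bytes are
    incompressible on their own, so any gain comes from the repeat.
    """
    chunk = os.urandom(chunk_size)
    data = chunk + chunk
    # compresslevel 1..9 selects a block size of roughly level * 100 kB.
    return len(bz2.compress(data, compresslevel=level)) / len(data)


# Both copies fit inside one 900 kB block, so the repeat can be exploited.
print("100 kB chunk, level 9:", round(doubled_ratio(100_000, 9), 3))
# Copies are 1 MB apart, farther than any bzip2 block can reach.
print("1 MB chunk, level 9:  ", round(doubled_ratio(1_000_000, 9), 3))
```

The first ratio should come out well below 1 and the second close to 1, since in the second case no block ever contains both copies of the same bytes.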