Remix.run Logo
Dylan16807 3 hours ago

With the static huffman table, the decompressed output is very similar from what you'd get with this algorithm:

  loop:
     chance to exit
     chance to output a random byte
     chance to repeat some previous bytes
If decompression succeeds, the chance that 95% of the output can disassemble is not very different from random noise, except sometimes it'll get lucky and repeat a valid opcode 120 times so maybe it's slightly higher.

But the chance that 100% can disassemble gets more complicated. It's two factors multiplied together, the chance decompression succeeds, and then the chance that the output can disassemble.

If decompression is successful, the output will have a lot fewer random bytes than 128. The repetitions have some impact, but less impact. So the chance that the output disassembles should be higher than the chance 128 random bytes disassemble.

With DEFLATE in particular, decompression fails 95+% of the time, so the overall odds of decompress+decode suck.

This is partially mitigated by DEFLATE exiting early a lot of the time, but only partially.

If you made a DEFLATE-like decompressor that crashes less and exits early more often, it would win by a mile.

If you made a DEFLATE-like decompressor that crashes rarely but doesn't exit early, it would likely be able to win the 100% decoding test. Some of the random bytes go into the output, some get used up on less dangerous actions than outputting random bytes, overall success rate goes up.