I just did some interactive shell loops and globs to compress everything and output CSV, which I processed into an ASCII table, so I don't exactly have a pipeline I can modify and re-run with compression speeds added ... but I can run some more interactive shell-glob-and-loop-based analysis to give you decompression speeds (a sketch of a compression-speed run follows at the end of this comment):

~/tmp/pdfbench $ hyperfine --warmup 2 \
'for x in zst/*; do zstd -d >/dev/null <"$x"; done' \
'for x in gz/*; do gzip -d >/dev/null <"$x"; done' \
'for x in xz/*; do xz -d >/dev/null <"$x"; done' \
'for x in br/*; do brotli -d >/dev/null <"$x"; done'
Benchmark 1: for x in zst/*; do zstd -d >/dev/null <"$x"; done
Time (mean ± σ): 164.6 ms ± 1.3 ms [User: 83.6 ms, System: 72.4 ms]
Range (min … max): 162.0 ms … 166.9 ms 17 runs
Benchmark 2: for x in gz/*; do gzip -d >/dev/null <"$x"; done
Time (mean ± σ): 143.0 ms ± 1.0 ms [User: 87.6 ms, System: 43.6 ms]
Range (min … max): 141.4 ms … 145.6 ms 20 runs
Benchmark 3: for x in xz/*; do xz -d >/dev/null <"$x"; done
Time (mean ± σ): 981.7 ms ± 1.6 ms [User: 891.5 ms, System: 93.0 ms]
Range (min … max): 978.7 ms … 984.3 ms 10 runs
Benchmark 4: for x in br/*; do brotli -d >/dev/null <"$x"; done
Time (mean ± σ): 254.5 ms ± 2.5 ms [User: 172.9 ms, System: 67.4 ms]
Range (min … max): 252.3 ms … 260.5 ms 11 runs
Summary
for x in gz/*; do gzip -d >/dev/null <"$x"; done ran
1.15 ± 0.01 times faster than for x in zst/*; do zstd -d >/dev/null <"$x"; done
1.78 ± 0.02 times faster than for x in br/*; do brotli -d >/dev/null <"$x"; done
6.87 ± 0.05 times faster than for x in xz/*; do xz -d >/dev/null <"$x"; done
As expected, xz is super slow. Gzip is fastest, zstd is somewhat slower, and brotli slower again, though still much faster than xz.

+-------+-------+--------+-------+
| gzip | zstd | brotli | xz |
+-------+-------+--------+-------+
| 143ms | 165ms | 255ms | 982ms |
+-------+-------+--------+-------+
I honestly expected zstd to win here.
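For completeness, compression speeds could be measured the same way with one more hyperfine invocation. This is only a sketch: pdf/ as the directory of original uncompressed files and the compression levels shown are assumptions, not necessarily the settings used for the corpus above.

~/tmp/pdfbench $ hyperfine --warmup 2 \
'for x in pdf/*; do zstd -19 -c "$x" >/dev/null; done' \
'for x in pdf/*; do gzip -9 -c "$x" >/dev/null; done' \
'for x in pdf/*; do xz -9 -c "$x" >/dev/null; done' \
'for x in pdf/*; do brotli -q 11 -c "$x" >/dev/null; done'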
terrelln (3 hours ago):

Zstd should not be slower than gzip to decompress here. Given that it has inflated the files to be bigger than the uncompressed data, it has to do more work to decompress. This seems like a bug, or somehow measuring the wrong thing, and not the expected behavior.
mort96 (2 hours ago):

It seems like zstd is somehow compressing really badly when many zstd processes are run in parallel, but works as expected when run sequentially: https://news.ycombinator.com/item?id=46723158 (a sketch for reproducing that comparison follows the raw output below)

Regardless, this does not make a significant difference. I ran hyperfine again against a 37M folder of .pdf.zst files, and the results are virtually identical for zstd and gzip:

+-------+-------+--------+-------+
| gzip | zstd | brotli | xz |
+-------+-------+--------+-------+
| 142ms | 165ms | 269ms | 994ms |
+-------+-------+--------+-------+
Raw hyperfine output:

~/tmp/pdfbench $ du -h zst2 gz xz br
37M zst2
38M gz
38M xz
37M br
~/tmp/pdfbench $ hyperfine ...
Benchmark 1: for x in zst2/*; do zstd -d >/dev/null <"$x"; done
Time (mean ± σ): 164.5 ms ± 2.3 ms [User: 83.5 ms, System: 72.3 ms]
Range (min … max): 162.3 ms … 172.3 ms 17 runs
Benchmark 2: for x in gz/*; do gzip -d >/dev/null <"$x"; done
Time (mean ± σ): 142.2 ms ± 0.9 ms [User: 87.4 ms, System: 43.1 ms]
Range (min … max): 140.8 ms … 143.9 ms 20 runs
Benchmark 3: for x in xz/*; do xz -d >/dev/null <"$x"; done
Time (mean ± σ): 993.9 ms ± 9.2 ms [User: 896.7 ms, System: 99.1 ms]
Range (min … max): 981.4 ms … 1007.2 ms 10 runs
Benchmark 4: for x in br/*; do brotli -d >/dev/null <"$x"; done
Time (mean ± σ): 269.1 ms ± 8.8 ms [User: 176.6 ms, System: 75.8 ms]
Range (min … max): 261.8 ms … 287.6 ms 10 runs
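To reproduce the parallel-vs-sequential compression difference mentioned above, something along these lines should work. Again only a sketch: pdf/ as the directory of originals, the seq/ and par/ output directories, and level -19 are all assumptions.

~/tmp/pdfbench $ mkdir -p seq par
~/tmp/pdfbench $ for x in pdf/*; do zstd -19 -c "$x" > "seq/$(basename "$x").zst"; done
~/tmp/pdfbench $ for x in pdf/*; do zstd -19 -c "$x" > "par/$(basename "$x").zst" & done; wait
~/tmp/pdfbench $ du -sh seq par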
terrelln (an hour ago):

Ah, I understand. In this benchmark, Zstd's decompression speed is 284 MB/s and Gzip's is 330 MB/s. This benchmark is likely dominated by file IO for the faster decompressors.

On the incompressible files, I'd expect decompression of any algorithm to approach the speed of `memcpy()`, and would generally expect zstd's decompression speed to be faster. For example, on an x86 core running at 2 GHz, Zstd is decompressing a file at 660 MB/s, and on my M1 at 1276 MB/s.

You could measure locally either using a specialized tool like lzbench [0], or for zstd by just running `zstd -b22 --ultra /path/to/file`, which will print the compression ratio, compression speed, and decompression speed.

[0] https://github.com/inikep/lzbench
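As a concrete example of taking file IO out of the measurement, zstd's built-in benchmark mode compresses and decompresses entirely in memory. some.pdf below is a placeholder filename, and the level range is just an illustration.

~/tmp/pdfbench $ zstd -b19 some.pdf # benchmark compression level 19 only
~/tmp/pdfbench $ zstd -b1 -e19 some.pdf # sweep levels 1 through 19
~/tmp/pdfbench $ zstd -b22 --ultra some.pdf # the invocation suggested above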
noname120 (3 hours ago):

Thanks a lot. Interestingly, Brotli's author mentioned here that zstd is 2× faster at decompressing, which roughly matches your numbers: https://news.ycombinator.com/item?id=46035817

I'm also really surprised that gzip performs better here. Is there some kind of hardware acceleration or the like?