Remix.run Logo
mort96 2 hours ago

It seems like zstd is somehow compressing really badly when many zstd processes are run in parallel, but works as expected when run sequentially: https://news.ycombinator.com/item?id=46723158

Regardless, this does not make a significant difference. I ran hyperfine again against a 37M folder of .pdf.zst files, and the results are virtually identical for zstd and gzip:

    +-------+-------+--------+-------+
    | gzip  | zstd  | brotli | xz    |
    +-------+-------+--------+-------+
    | 142ms | 165ms | 269ms  | 994ms |
    +-------+-------+--------+-------+
Raw hyperfine output:

    ~/tmp/pdfbench $ du -h zst2 gz xz br
     37M    zst2
     38M    gz
     38M    xz
     37M    br
    
    ~/tmp/pdfbench $ hyperfine ...
    Benchmark 1: for x in zst2/*; do zstd -d >/dev/null <"$x"; done
      Time (mean ± σ):     164.5 ms ±   2.3 ms    [User: 83.5 ms, System: 72.3 ms]
      Range (min … max):   162.3 ms … 172.3 ms    17 runs
    
    Benchmark 2: for x in gz/*; do gzip -d >/dev/null <"$x"; done
      Time (mean ± σ):     142.2 ms ±   0.9 ms    [User: 87.4 ms, System: 43.1 ms]
      Range (min … max):   140.8 ms … 143.9 ms    20 runs
    
    Benchmark 3: for x in xz/*; do xz -d >/dev/null <"$x"; done
      Time (mean ± σ):     993.9 ms ±   9.2 ms    [User: 896.7 ms, System: 99.1 ms]
      Range (min … max):   981.4 ms … 1007.2 ms    10 runs
    
    Benchmark 4: for x in br/*; do brotli -d >/dev/null <"$x"; done
      Time (mean ± σ):     269.1 ms ±   8.8 ms    [User: 176.6 ms, System: 75.8 ms]
      Range (min … max):   261.8 ms … 287.6 ms    10 runs
terrelln 42 minutes ago | parent [-]

Ah I understand. In this benchmark, Zstd's decompression time is 284 MB/s, and Gzip's is 330 MB/s. This benchmark is likely dominated by file IO for the faster decompressors.

On the incompressible files, I'd expect decompression of any algorithm to approach the speed of `memcpy()`. And would generally expect zstd's decompression speed to be faster. For example, on a x86 core running at 2GHz, Zstd is decompressing a file at 660 MB/s, and on my M1 at 1276 MB/s.

You could measure locally either using a specialized tool like lzbench [0], or for zstd by just running `zstd -b22 --ultra /path/to/file`, which will print the compression ratio, compression speed, and decompression speed.

[0] https://github.com/inikep/lzbench