terrelln 2 hours ago
I've figured out the issue: use `wc -c` instead of `du`. I can repro on my Mac with either `zstd` or `gzip` (see the sketch below).
When a file is overwritten, the on-disk size is bigger. I don't know why. But you must have run zstd's benchmark twice, and every other compressor's benchmark once. I'm a zstd developer, so I have a vested interest in accurate benchmarks, and in finding & fixing issues :)
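(The exact repro steps weren't included above, so this is a hypothetical sketch of the kind of thing being described, with made-up file names; the key detail is overwriting an existing output with `-f` and comparing block-based vs. byte-based sizes.)

```sh
# Hypothetical repro sketch, not terrelln's exact steps.
head -c 10000000 /dev/zero > data.bin   # throwaway, highly compressible input

zstd -f data.bin           # first compression
du -h data.bin.zst         # size in allocated blocks
wc -c < data.bin.zst       # size in bytes

zstd -f data.bin           # overwrite the existing .zst
du -h data.bin.zst         # may now report a larger on-disk size
wc -c < data.bin.zst       # byte count is unchanged
```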
mort96 an hour ago
Interesting! It doesn't seem to be only about overwriting: I can start in a directory without any .zst files, run the command to compress 55 files in parallel, and it's still 45M according to `du -h`. But you're right, `wc -c` shows 38809999 bytes regardless of whether `du -h` shows 45M after a parallel compression or 38M after a sequential compression. My mental model of `du` was basically that it gives a size accurate to the nearest 4k block, which is usually accurate enough. Seems I have to reconsider. Too bad there's no standard alternative with the interface of `du` but byte-accurate file sizes...
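(For what it's worth, there's a non-standard option: GNU coreutils' `du` has an apparent-size mode that sums byte counts instead of allocated blocks. On macOS the GNU tools typically install via Homebrew, where `du` is named `gdu`. The flags below are the GNU `du` and BSD/GNU `stat` ones; a sketch, not a full substitute for a POSIX-standard tool.)

```sh
# GNU du: report apparent (byte) sizes instead of allocated blocks.
du --apparent-size --block-size=1 *.zst
du -sb .                    # -b is shorthand for the two flags above

# Per-file byte size via stat:
stat -f %z file.zst         # BSD/macOS stat
stat -c %s file.zst         # GNU stat
```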