| ▲ | xxs 7 hours ago |
| yup, zstd is better. Overall, use zstd for pretty much anything that can benefit from general-purpose compression. It's a beyond-excellent library, tool, and family of algorithms. Brotli without a custom dictionary is a weird choice to begin with. |
|
| ▲ | adzm 7 hours ago | parent | next [-] |
| Brotli makes a bit of sense considering this is a static asset; it compresses somewhat better than zstd. This is why Brotli is pretty ubiquitous for precompressed static assets on the Web. That said, I personally prefer zstd as well; it's been a great general-purpose library. |
| |
| ▲ | stonogo 8 minutes ago | parent | next [-] | | Brotli is ubiquitous because Google recommends it. Deflate is old and definitely sucks, but Google ships Brotli in Chrome, and since Chrome is the de facto default platform nowadays, I'd imagine Brotli was chosen as the lowest-effort lift. Nevertheless, I expect this to be JBIG2 all over again: almost nobody will use it, because we've got decades of devices and software in the wild that can't, and a 20% filesize saving is pointless if your destination can't read the damn thing. | |
| ▲ | dist-epoch 6 hours ago | parent | prev [-] | | You need to crank up the zstd compression level. zstd is Pareto-better than brotli: it compresses better and faster. | | |
| ▲ | atiedebee 5 hours ago | parent | next [-] | | I thought the same, so I ran brotli and zstd on some PDFs I had lying around.

brotli 1.0.7, args: -q 11 -w 24
zstd v1.5.0, args: --ultra -22 --long=31

               | Original | zstd  | brotli
RandomBook.pdf | 15M      | 4.6M  | 4.5M
Invoice.pdf    | 19.3K    | 16.3K | 16.1K

I made a table because I wanted to test more files, but almost all PDFs I downloaded or had stored locally were already compressed, and I couldn't quickly find a way to decompress them. Brotli seemed to have a very slight edge over zstd, even on the larger PDF, which I did not expect. | | |
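A harness for this kind of per-file table is a few lines of Python. zstd and brotli need third-party bindings (e.g. the `zstandard` and `brotli` packages), so this sketch uses stdlib codecs as stand-ins; the real compressors slot in the same way:

```python
# Sketch of scripting a per-file compression table like the one above.
# zstd and brotli bindings are third-party, so stdlib codecs stand in here.
import bz2
import lzma
import zlib


def sizes(data: bytes) -> dict:
    """Compressed size of `data` under each codec at its highest level."""
    return {
        "original": len(data),
        "zlib": len(zlib.compress(data, level=9)),
        "bz2": len(bz2.compress(data, compresslevel=9)),
        "lzma": len(lzma.compress(data, preset=9)),
    }


if __name__ == "__main__":
    # A repetitive stand-in for file contents; read real PDFs here instead.
    sample = b"the quick brown fox jumps over the lazy dog\n" * 1000
    for name, size in sizes(sample).items():
        print(f"{name:>8}: {size}")
```

Point it at real file bytes (`open(path, "rb").read()`) to rebuild the table per PDF.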
| ▲ | mort96 4 hours ago | parent | next [-] | | EDIT: Something weird is going on here. When compressing with zstd in parallel it produces the garbage results seen here, but when compressing on a single core it produces a result competitive with Brotli (37M). See: https://news.ycombinator.com/item?id=46723158

I did my own testing, where Brotli also ended up better than zstd: https://news.ycombinator.com/item?id=46722044

Results by compression type across 55 PDFs:

+------+------+-----+------+--------+
| none | zstd | xz  | gzip | brotli |
+------+------+-----+------+--------+
| 47M  | 45M  | 39M | 38M  | 37M    |
+------+------+-----+------+--------+
| |
| ▲ | Thoreandan an hour ago | parent | prev | next [-] | | Does your source .pdf material have FlateDecode'd chunks or did you fully uncompress it? | |
| ▲ | mrspuratic 3 hours ago | parent | prev | next [-] | | > I couldn't quickly find a way to decompress them

pdftk in.pdf output out.pdf decompress
| |
| ▲ | order-matters 5 hours ago | parent | prev [-] | | What's the assumption we can point to as the reason for the counter-intuitive result? That the data in PDF files is noisy, and zstd should perform better on noisy files? | | |
| ▲ | jeffbee 4 hours ago | parent [-] | | What's counter-intuitive about this outcome? | | |
| ▲ | order-matters 4 hours ago | parent [-] | | Maybe that was too strongly worded, but there was an expectation that zstd would outperform, so the fact that it didn't makes the result unexpected. I generally find it helpful to understand why something performs better than expected. | | |
| ▲ | mort96 4 hours ago | parent [-] | | Isn't zstd primarily designed to provide decent compression ratios at amazing speeds? The reason it's exciting is mainly that you can add compression to places where it didn't necessarily make sense before because it's almost free in terms of CPU and memory consumption. I don't think it has ever had a stated goal of beating compression ratio focused algorithms like brotli on compression ratio. | | |
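The ratio-for-speed tradeoff described above can be sketched with stdlib codecs, since zstd isn't in the Python standard library: zlib at level 1 stands in for a speed-oriented setting, lzma at preset 9 for a ratio-oriented one (a rough illustration of the tradeoff, not a zstd benchmark):

```python
# Rough stdlib illustration of the speed-vs-ratio tradeoff: zlib level 1
# as a stand-in for a fast codec, lzma preset 9 for a ratio-focused one.
import lzma
import time
import zlib

data = bytes(range(256)) * 4096  # ~1 MiB of mildly compressible data

t0 = time.perf_counter()
fast = zlib.compress(data, level=1)    # speed-oriented setting
fast_time = time.perf_counter() - t0

t0 = time.perf_counter()
small = lzma.compress(data, preset=9)  # ratio-oriented setting
small_time = time.perf_counter() - t0

print(f"zlib level 1:  {len(fast):>7} bytes in {fast_time:.4f}s")
print(f"lzma preset 9: {len(small):>7} bytes in {small_time:.4f}s")
```

The ratio-focused setting produces smaller output but takes markedly longer, which is the same shape of tradeoff the zstd-vs-brotli comparison is about.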
| ▲ | sgerenser 3 hours ago | parent [-] | | I actually thought zstd was supposed to be better than Brotli in most cases, but a bit of searching reveals you're right: Brotli at its highest compression levels (10/11) often exceeds zstd at its highest levels (20-22). Both are very slow at those levels, but perfectly suitable for "compress once, decompress many" applications, of which the PDF spec is obviously one. |
|
| |
| ▲ | itsdesmond 3 hours ago | parent | prev | next [-] | | > Pareto I don’t think you’re using that correctly. | | |
| ▲ | wizzwizz4 an hour ago | parent [-] | | It's a correct use of "Pareto", short for Pareto frontier, if the claim being made is "for every needed compression ratio, zstd is faster; and for every needed time budget, zstd compresses better". (Whether this claim is true is another matter.) |
| |
| ▲ | DetroitThrow 5 hours ago | parent | prev | next [-] | | I love zstd but this isn't necessarily true. | |
| ▲ | dchest 5 hours ago | parent | prev | next [-] | | Not with small files. | |
| ▲ | jeffbee 6 hours ago | parent | prev [-] | | Are you sure? Admittedly I only have 1 PDF in my homedir, but no combination of flags to zstd gets it to match the size of brotli's output on that particular file. Even zstd --long --ultra -22. |
|
|
|
| ▲ | deepsun an hour ago | parent | prev | next [-] |
| Brotli compresses my files way better, but it does so way slower. In any case, the universal statement "zstd is better" is not valid. |
|
| ▲ | greenavocado 7 hours ago | parent | prev [-] |
| This bizarre move has all the hallmarks of embrace-extend-extinguish rather than technical excellence. |