CDC: Why Decompression Is Worth the Complexity (wael.nasreddine.com)
2 points by kalbasit 2 hours ago | 1 comment
kalbasit 2 hours ago

I'm building a Nix cache server and ran into a classic system design dilemma: chunk the data while it is still compressed (fast and simple), or decompress it first and chunk the raw bytes (slower and more complex)?

I tested both approaches on 60k+ NAR files to find out.

Compressed: 6.4% dedup hit rate
Uncompressed: 47.8% dedup hit rate

Decompression wins, saving 18% in total storage.
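
For anyone wondering why the gap is so large, here is a minimal sketch of the effect. It is not the benchmark above and not the linked fastcdc library: it uses a simple Gear-style rolling hash as a stand-in for FastCDC, synthetic text in place of real NAR files, and gzip as the compressor. All names (`chunk`, `hitRate`, `sample`, `compare`) and the parameters (roughly 2 KiB target chunks, 512-byte minimum) are made up for illustration. The idea: a small edit in the raw data only disturbs the chunks around it, while in the compressed stream everything after the edit tends to change, so few chunks match.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"crypto/sha256"
	"fmt"
	"math/rand"
)

// Illustrative parameters: 11 mask bits give a boundary about once per 2 KiB.
const (
	mask    = uint64(0x7FF) << 32
	minSize = 512
)

// chunk cuts data at content-defined boundaries using a Gear-style rolling
// hash. This is a simplified stand-in for FastCDC, not the real algorithm.
func chunk(data []byte, gear *[256]uint64) [][]byte {
	var chunks [][]byte
	var h uint64
	start := 0
	for i, b := range data {
		h = (h << 1) + gear[b]
		if i+1-start >= minSize && h&mask == 0 {
			chunks = append(chunks, data[start:i+1])
			start, h = i+1, 0
		}
	}
	if start < len(data) {
		chunks = append(chunks, data[start:])
	}
	return chunks
}

// hitRate returns the fraction of b's chunks whose hash already occurs in a.
func hitRate(a, b []byte, gear *[256]uint64) float64 {
	seen := map[[32]byte]bool{}
	for _, c := range chunk(a, gear) {
		seen[sha256.Sum256(c)] = true
	}
	chunksB := chunk(b, gear)
	hits := 0
	for _, c := range chunksB {
		if seen[sha256.Sum256(c)] {
			hits++
		}
	}
	return float64(hits) / float64(len(chunksB))
}

// gz compresses data with gzip; writes to a bytes.Buffer cannot fail here.
func gz(data []byte) []byte {
	var buf bytes.Buffer
	w := gzip.NewWriter(&buf)
	w.Write(data)
	w.Close()
	return buf.Bytes()
}

// sample builds synthetic, compressible text (not real NAR content).
func sample(rng *rand.Rand, size int) []byte {
	var buf bytes.Buffer
	for buf.Len() < size {
		fmt.Fprintf(&buf, "/nix/store/%08x-pkg-%d/lib/libfoo.so.%d\n",
			rng.Uint32(), rng.Intn(500), rng.Intn(10))
	}
	return buf.Bytes()
}

// compare chunks two nearly identical inputs, raw and gzip-compressed.
func compare() (raw, compressed float64) {
	rng := rand.New(rand.NewSource(1))
	var gear [256]uint64
	for i := range gear {
		gear[i] = rng.Uint64()
	}
	a := sample(rng, 1<<20)
	b := append([]byte(nil), a...)
	for off := 1000; off < len(b); off += 128 << 10 {
		copy(b[off:], "EDIT") // a small edit every 128 KiB
	}
	return hitRate(a, b, &gear), hitRate(gz(a), gz(b), &gear)
}

func main() {
	raw, compressed := compare()
	fmt.Printf("raw hit rate: %.1f%%\n", raw*100)
	fmt.Printf("compressed hit rate: %.1f%%\n", compressed*100)
}
```

The exact percentages depend on the data, the compressor, and the chunking parameters, so this toy will not reproduce the 6.4% and 47.8% figures. It only shows the direction of the effect.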

(P.S. To handle the pipeline throughput, I also built the fastest FastCDC implementation in Go: https://github.com/kalbasit/fastcdc)