XYen0n 2 hours ago
The OCI manifest references the hashes of these compressed layers, and re-compressing a layer does not guarantee producing the same hash.
|
mort96 an hour ago
If that's the purpose, couldn't you store the hash and throw away the compressed image? (As others have said, compression is deterministic given the same algorithm, parameters, and input data.)
a_t48 27 minutes ago
Zstd, for example, only promises determinism within the same version of the library. I've personally seen hashes mutate between pull and export. Things like tar padding also make a difference. Really, the thing to do is to hash the _uncompressed_ data and treat compression as a transport/registry detail. That's what I've done, at least.
mort96 6 minutes ago
I didn't know that about zstd; that's a bit unfortunate. Tar isn't relevant here, though: we're talking about compression, not archival formats.
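A minimal sketch of a_t48's suggestion, using Python's standard library (the file name and contents are made up for illustration): the digest of the uncompressed layer tar is stable across any compression round-trip, even when the compressed bytes are not. This is essentially what the OCI image config does with its "DiffIDs", which are digests of the uncompressed layers.

```python
import gzip, hashlib, io, tarfile

# Build a tiny uncompressed "layer" tar in memory.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as tar:
    data = b"hello\n"
    info = tarfile.TarInfo(name="etc/motd")  # hypothetical file
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))
layer = buf.getvalue()

# The digest of the *uncompressed* tar survives recompression,
# regardless of what the compressed blob hashes to.
diff_id = hashlib.sha256(layer).hexdigest()
roundtrip = gzip.decompress(gzip.compress(layer))
assert hashlib.sha256(roundtrip).hexdigest() == diff_id
```

A registry could then compress however it likes (or change algorithms later) without invalidating the content address.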
flakes 2 hours ago
Recompressing should be deterministic. It's the packing/unpacking of tar archives to/from directories on disk that introduces the non-determinism (timestamps, ownership metadata, and the like). If the tar is left intact, both zstd and gzip should produce byte-for-byte identical output given the same compression parameters.
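To illustrate the packing point, here is a small sketch (file name and contents are invented): the same file content packed with different mtimes yields different archive bytes, while normalizing the metadata restores determinism.

```python
import hashlib, io, tarfile

def pack(payload: bytes, mtime: int) -> bytes:
    """Pack a single file into an in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        info = tarfile.TarInfo(name="app/config")  # hypothetical path
        info.size = len(payload)
        info.mtime = mtime        # metadata lives in the tar header
        info.uid = info.gid = 0   # pin ownership for reproducibility
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()

data = b"listen = 8080\n"

# Identical content, different mtimes -> different archive bytes.
a = pack(data, 0)
b = pack(data, 1700000000)
print(hashlib.sha256(a).hexdigest() == hashlib.sha256(b).hexdigest())  # False

# Normalized metadata -> byte-identical archives on every run.
print(pack(data, 0) == pack(data, 0))  # True
```

Tools like `docker buildx` and reproducible-build pipelines apply exactly this kind of metadata normalization before hashing.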
XYen0n 36 minutes ago
You are correct; I confused archiving with compression. However, even considering only the compression step, using the same compression parameters cannot be guaranteed, since it is unknown which parameters the image publisher used.
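A quick sketch of that failure mode with gzip from Python's standard library (the payload is arbitrary): two parties compressing identical input at different levels get different blobs, hence different content-addressed digests, even though the input (and its uncompressed digest) is the same.

```python
import gzip, hashlib

payload = b"the quick brown fox jumps over the lazy dog\n" * 100

# Same input, different compression levels, mtime pinned to 0 so
# only the level parameter varies between the two runs.
fast = gzip.compress(payload, compresslevel=1, mtime=0)
best = gzip.compress(payload, compresslevel=9, mtime=0)

# The compressed digests differ...
print(hashlib.sha256(fast).hexdigest() != hashlib.sha256(best).hexdigest())  # True
# ...while both decompress back to the identical input.
print(gzip.decompress(fast) == gzip.decompress(best) == payload)  # True
```

Without knowing the publisher's level (and library version, for zstd), a verifier cannot reproduce the compressed digest from the uncompressed data alone.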