marginalia_nu 6 hours ago

Zip with no compression is a nice contender for a container format that shouldn't be slept on. It effectively reduces the I/O while, unlike TAR, allowing direct random access to the files without "extracting" them or seeking through the entire file. This is possible even via mmap, over HTTP range queries, etc.
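To make the random-access point concrete, here is a minimal Python sketch, assuming a small non-ZIP64, unencrypted archive. The helper name `stored_member_span` and the file names are made up for illustration. It looks up one stored member's byte range and reads it straight out of an mmap of the archive, without extracting anything.

```python
import mmap
import os
import struct
import tempfile
import zipfile


def stored_member_span(path, name):
    """Return (offset, length) of an uncompressed member's raw bytes.

    Hypothetical helper: only handles stored (uncompressed) members.
    """
    with zipfile.ZipFile(path) as zf:
        info = zf.getinfo(name)
    if info.compress_type != zipfile.ZIP_STORED:
        raise ValueError(f"{name} is compressed; its bytes cannot be used in place")
    # The local file header is 30 fixed bytes followed by a variable-length
    # file name and extra field; their lengths sit at byte offsets 26 and 28.
    with open(path, "rb") as f:
        f.seek(info.header_offset)
        header = f.read(30)
    if header[:4] != b"PK\x03\x04":
        raise ValueError("local file header signature not found")
    name_len, extra_len = struct.unpack("<HH", header[26:30])
    return info.header_offset + 30 + name_len + extra_len, info.file_size


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "bundle.zip")
        with zipfile.ZipFile(path, "w", compression=zipfile.ZIP_STORED) as zf:
            zf.writestr("a.txt", b"first file")
            zf.writestr("b.txt", b"second file")
        offset, length = stored_member_span(path, "b.txt")
        # Slice the member directly out of the mapped archive.
        with open(path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                return bytes(m[offset:offset + length])
```

The same (offset, length) pair could in principle be turned into an HTTP `Range: bytes=offset-(offset+length-1)` request instead of an mmap slice.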

You can still get the compression benefits by serving files with Content-Encoding: gzip or whatever. Though it has built-in compression, you can just not use that and use external compression instead, especially over the wire.

It's pretty widely used, though often dressed up as something else. JAR files or APK files or whatever.

I think the article's complaints about lacking Unix access rights and metadata are a bit strange. That seems like a feature more than a bug, as I wouldn't expect this to be something that transfers between machines. I don't want to unpack an archive and have to scrutinize it for files with o+rxst permissions, or have their creation date be anything other than when I unpacked them.

1718627440 5 hours ago | parent | next [-]

Isn't this what is already common in the Python community?

> I don't want to unpack an archive and have to scrutinize it for files with o+rxst permissions, or have their creation date be anything other than when I unpacked them.

I'm the opposite: when I pack and unpack something, I want the files to be identical, including attributes. Why should I throw away all the timestamps just because the files were temporarily in an archive?

password4321 3 hours ago | parent | next [-]

> Why should I throw away all the timestamps just because the files were temporarily in an archive?

In case anyone is unaware, you don't have to throw away all the timestamps when using "zip with no compression". The metadata for each zipped file includes one timestamp (originally rounded to an even number of seconds in local time).
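A quick way to see the two-second granularity is to round-trip a timestamp through Python's `zipfile` module. This is only a sketch: the function name `roundtrip_mtime` and the member name are made up, and it covers just the classic DOS timestamp field, not any extended timestamp extra fields that some tools write.

```python
import io
import zipfile


def roundtrip_mtime(date_time):
    """Store one file with the given (Y, M, D, h, m, s) and read the time back."""
    buf = io.BytesIO()
    info = zipfile.ZipInfo("note.txt", date_time=date_time)
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr(info, b"hello")
    with zipfile.ZipFile(buf) as zf:
        return zf.getinfo("note.txt").date_time


# An odd number of seconds comes back rounded down to the even value.
print(roundtrip_mtime((2024, 1, 2, 3, 4, 5)))  # (2024, 1, 2, 3, 4, 4)
```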

I am a big last modified timestamp fan and am often discouraged that scp, git, and even many zip utilities are not (at least by default).

rcxdude an hour ago | parent [-]

Git updates timestamps on checkout partly out of necessity, for compatibility with build systems. If it instead applied the time the file was last modified, most build systems would break when you checked out an older commit.
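A rough illustration of why, assuming a make-style rule that rebuilds only when the source is newer than the target. The file names and the timestamps 1000/2000/3000 are arbitrary, and `needs_rebuild` is a stand-in for what a build tool does, not any particular tool's logic.

```python
import os
import tempfile


def needs_rebuild(target, source):
    """Make-style check: rebuild if the target is missing or older than the source."""
    if not os.path.exists(target):
        return True
    return os.stat(source).st_mtime_ns > os.stat(target).st_mtime_ns


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "main.c")
        target = os.path.join(tmp, "main.o")
        for path in (source, target):
            open(path, "w").close()
        # main.o was built from the newer commit at t=2000.
        os.utime(target, (2000, 2000))
        # Checking out an older commit rewrites main.c. With the checkout time
        # as mtime (t=3000), the source looks newer and gets rebuilt.
        os.utime(source, (3000, 3000))
        with_checkout_time = needs_rebuild(target, source)
        # If the old commit's time (t=1000) were restored instead, the stale
        # main.o would look up to date and would be silently kept.
        os.utime(source, (1000, 1000))
        with_commit_time = needs_rebuild(target, source)
        return with_checkout_time, with_commit_time


print(demo())  # (True, False)
```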

rustyhancock 3 hours ago | parent | prev [-]

Yes, it's a lossy process.

If your archive drops it you can't get it back.

If you don't want it you can just chmod -R u=rw,go=r,a-x

1718627440 3 hours ago | parent [-]

> If your archive drops it you can't get it back.

Hence, the common archive format is tar, not zip.

LtdJorge an hour ago | parent | prev | next [-]

Doesn’t ZIP have all the metadata at the end of the file, requiring some seeking still?

conradludgate 37 minutes ago | parent [-]

Yes, but it's an O(1) random-access seek rather than an O(n) scanning seek.
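Roughly, a reader looks at the tail of the file for the End of Central Directory record, which gives the offset of the central directory in one hop. A minimal Python sketch, assuming a plain non-ZIP64 archive and a naive signature search (the function and constant names are made up):

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FIXED_SIZE = 22   # fixed part of the End of Central Directory record
MAX_COMMENT = 0xFFFF   # the trailing archive comment is at most 65535 bytes


def find_central_directory(f):
    """Return (entry_count, cd_offset, cd_size) by reading only the file's tail."""
    f.seek(0, io.SEEK_END)
    file_size = f.tell()
    tail_len = min(file_size, EOCD_FIXED_SIZE + MAX_COMMENT)
    f.seek(file_size - tail_len)
    tail = f.read(tail_len)
    # Naive search: takes the last occurrence of the signature in the tail.
    pos = tail.rfind(EOCD_SIGNATURE)
    if pos < 0 or len(tail) - pos < EOCD_FIXED_SIZE:
        raise ValueError("End of Central Directory record not found")
    (_sig, _disk, _cd_disk, _entries_on_disk, entries,
     cd_size, cd_offset, _comment_len) = struct.unpack(
        "<4s4H2LH", tail[pos:pos + EOCD_FIXED_SIZE])
    return entries, cd_offset, cd_size


def demo():
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        for i in range(3):
            zf.writestr(f"file{i}.txt", b"x" * 100)
    entries, cd_offset, cd_size = find_central_directory(buf)
    # The central directory starts with a "PK\x01\x02" file header signature.
    buf.seek(cd_offset)
    return entries, buf.read(4)
```

The work done here is bounded by the size of the tail read, regardless of how many bytes of file data precede it, whereas tar has to walk header by header from the start.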

stabbles 6 hours ago | parent | prev [-]

> Zip with no compression is a nice contender for a container format that shouldn't be slept on

SquashFS with zstd compression is used by various container runtimes, and is popular in HPC where filesystems often have high latency. It can be mounted natively or with FUSE, and the decompression overhead is not really felt.

ciupicri 3 hours ago | parent [-]

Wouldn't you still have a lot of syscalls?

stabbles 3 hours ago | parent | next [-]

Yes, but with much lower latency. The squashfs file ensures the files are close together and you benefit from fs cache a lot.

LtdJorge an hour ago | parent | prev [-]

You then use io_uring