dabinat 9 hours ago

The problem with using S3 as a filesystem is that objects are immutable, and that hasn’t changed with S3 Files. So if I have a large file and change 1 byte of it, or even just rename it, the entire file needs to be uploaded all over again. This seems most useful for read-heavy workflows with files small enough to fit in the cache.

wolttam 9 hours ago | parent | next [-]

That’s not that different from how CoW filesystems work - there is no rule that files must map 1:1 to objects; you can (transparently) divide a file into smaller chunks to enable more fine-grained edits.
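The chunking idea can be sketched in a few lines. This is a toy model, not any real S3 API: the object store is mocked as a dict, and names like `write_byte` and `CHUNK_SIZE` are made up for illustration. The point is that a 1-byte edit rewrites only the one small chunk object containing that byte, not the whole file.

```python
CHUNK_SIZE = 4  # tiny for illustration; real systems use multi-MB chunks

store = {}  # mock object store: key -> bytes (each value is one "S3 object")

def write_file(name, data):
    """Split data into fixed-size chunks, one immutable object per chunk."""
    for i in range(0, len(data), CHUNK_SIZE):
        store[f"{name}/chunk-{i // CHUNK_SIZE}"] = data[i:i + CHUNK_SIZE]

def write_byte(name, offset, value):
    """Rewrite only the chunk containing `offset` - the rest stays put."""
    idx = offset // CHUNK_SIZE
    key = f"{name}/chunk-{idx}"
    chunk = bytearray(store[key])
    chunk[offset % CHUNK_SIZE] = value
    store[key] = bytes(chunk)  # only this one small object is re-uploaded

def read_file(name):
    """Reassemble the file by concatenating its chunks in order."""
    parts = []
    i = 0
    while f"{name}/chunk-{i}" in store:
        parts.append(store[f"{name}/chunk-{i}"])
        i += 1
    return b"".join(parts)

write_file("big.bin", b"hello world!")   # 3 chunk objects
write_byte("big.bin", 6, ord("W"))       # touches only chunk-1
print(read_file("big.bin"))              # b'hello World!'
```

A real implementation would also need a manifest to track chunk order and sizes, but the upload cost of a point edit is the same: one chunk, not the file.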

direwolf20 3 hours ago | parent | next [-]

But this doesn't

9 hours ago | parent | prev [-]
[deleted]
jamesblonde 4 hours ago | parent | prev | next [-]

Files can be immutable if you have mutable metadata - but S3 does not have mutable metadata, so you can't rename a directory without a full copy of all its contents.

Immutability can be worked around by chunking files, which allows them to be opened and appended to - we do this in HopsFS. However, random writes are typically not supported in scale-out metadata file systems - though thankfully they are rarely used by POSIX clients.
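The mutable-metadata idea above can be sketched as immutable chunk objects plus a small mutable manifest mapping paths to chunk keys. Rename touches only the manifest; append uploads exactly one new chunk. The names here (`manifest`, `append`, etc.) are illustrative, not HopsFS's actual API.

```python
import uuid

chunks = {}    # immutable object store: key -> bytes (never overwritten)
manifest = {}  # mutable metadata: path -> ordered list of chunk keys

def create(path):
    manifest[path] = []

def append(path, data):
    """Upload one new immutable chunk and record it in the metadata."""
    key = uuid.uuid4().hex
    chunks[key] = data
    manifest[path].append(key)  # only the metadata mutates

def rename(src, dst):
    """Rename is a pure metadata operation - no data is copied or moved."""
    manifest[dst] = manifest.pop(src)

def read(path):
    return b"".join(chunks[k] for k in manifest[path])

create("/logs/app.log")
append("/logs/app.log", b"line1\n")
append("/logs/app.log", b"line2\n")
rename("/logs/app.log", "/logs/app.old")
print(read("/logs/app.old"))  # b'line1\nline2\n'
```

With S3 alone there is no such mutable manifest, which is why a directory rename there degenerates into copying every object under the old prefix.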

aforwardslash 5 hours ago | parent | prev [-]

Depends on how you implement the fs layer on top of S3. As a quick example, I've done a couple of implementations of exactly that, where a file is chunked into multiple S3 objects; this allows for CoW semantics if required, plus parallel uploads/downloads. In the end it heavily depends on your use case.
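A minimal sketch of how chunking enables CoW semantics, under the same toy assumptions as above (an in-memory dict stands in for the object store; all names are made up): an edit uploads a *new* chunk object and a new manifest version that points at it, while old versions keep pointing at the old chunks, so earlier snapshots stay readable for free.

```python
import hashlib

chunks = {}    # content-addressed store: sha256 hex -> bytes
versions = []  # each file version is an ordered list of chunk keys

def put_chunk(data):
    """Store a chunk under its content hash (identical chunks dedup)."""
    key = hashlib.sha256(data).hexdigest()
    chunks[key] = data
    return key

def write_version(parts):
    """Write an initial version from a list of chunk payloads."""
    versions.append([put_chunk(p) for p in parts])
    return len(versions) - 1

def cow_update(version, index, data):
    """Copy the manifest and swap one chunk pointer; old data untouched."""
    keys = list(versions[version])
    keys[index] = put_chunk(data)  # only the changed chunk is uploaded
    versions.append(keys)
    return len(versions) - 1

def read(version):
    return b"".join(chunks[k] for k in versions[version])

v0 = write_version([b"AAAA", b"BBBB", b"CCCC"])
v1 = cow_update(v0, 1, b"bbbb")
print(read(v0))  # b'AAAABBBBCCCC'  (old snapshot still intact)
print(read(v1))  # b'AAAAbbbbCCCC'
```

Note that the two versions share the unchanged chunks: after the update the store holds four chunk objects, not six, which is where the storage savings of CoW come from.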