jamesblonde 6 days ago

The difference is scale-out metadata in the filesystem. Alluxio uses Raft, I believe, for metadata, which means the metadata has to fit on a single server.

rfoo 6 days ago | parent [-]

3FS isn't particularly fast in mdbench, though. Maybe our FDB tuning skills are to blame, or FUSE, I don't know, but it doesn't really matter.

The truly amazing part for me is the combination of NVMe SSDs, RDMA, and support for efficiently reading a huge batch of random offsets from a few already-opened huge files. This is how you get your training boxes consuming 20-30 GiB/s (and roughly 4 million IOPS).
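
To make the access pattern concrete, here is a minimal sketch of "a batch of random offsets from an already-opened file". It uses plain POSIX `os.pread` and a thread pool, not the 3FS client API; `batch_pread` and the demo file are hypothetical names made up for illustration, and it says nothing about how 3FS does this over RDMA.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def batch_pread(fd, requests, workers=8):
    """Read a batch of (offset, length) pairs from one already-open fd.

    os.pread takes an explicit offset, so the requests do not share a
    file position and can be issued concurrently. Results come back in
    the same order as the requests. (Unix only; hypothetical helper.)
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: os.pread(fd, r[1], r[0]), requests))


# Demo: a 1 MiB file whose byte at offset i is i % 256.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(bytes(range(256)) * 4096)
    demo_path = f.name

fd = os.open(demo_path, os.O_RDONLY)  # open once, reuse for every read
try:
    chunks = batch_pread(fd, [(0, 4), (256 * 10 + 5, 3), (1024 * 1024 - 2, 2)])
finally:
    os.close(fd)
    os.unlink(demo_path)
```

The point is only that the file is opened once and every read carries its own offset, so the per-read cost is one request rather than open + seek + read.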

rjzzleep 6 days ago | parent [-]

FUSE has traditionally been famously slow. I remember there were some changes that supposedly made it faster, but maybe that was just a certain FUSE implementation.

jamesblonde 6 days ago | parent [-]

The block size is 4KB by default, which is a killer. We set it to 1MB or so by default, which makes a huge difference.

https://github.com/logicalclocks/hopsfs-go-mount
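
A rough back-of-the-envelope for why the block size matters, assuming each block turns into one request through the FUSE layer (a simplification; real kernels may merge or read ahead). `request_count` is a hypothetical helper, and this is arithmetic, not a benchmark.

```python
def request_count(total_bytes, block_bytes):
    """Number of block-sized requests needed to read total_bytes (ceiling division)."""
    return -(-total_bytes // block_bytes)


GIB = 1024 ** 3
small = request_count(GIB, 4 * 1024)     # 4KB blocks
large = request_count(GIB, 1024 * 1024)  # 1MB blocks
ratio = small // large                   # how many times fewer requests with 1MB
```

Under that assumption, reading 1 GiB takes 262144 requests at 4KB versus 1024 at 1MB, so any fixed per-request overhead is paid 256 times less often.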