bjackman 9 hours ago

What, um... Are... Are people using samba to sync model weights between cluster nodes...?

topspin 5 hours ago | parent | next [-]

Why not? SMB is no slouch. Microsoft has taken network storage performance very seriously for a long time now. Back in the day, Microsoft and others (NetApp, for instance) worked hard to extend and optimize SMB and deliver efficient, high-throughput file servers. I haven't kept up with the state of the art recently, but I know there have been long stretches where SMB consistently led the field in benchmark testing. It also doesn't hurt that Microsoft has a lot of pull with hardware manufacturers to see their native protocols remain tier 1 concerns at all times.

whizzter 3 hours ago | parent [-]

I think a lot of people have a hard time differentiating the underlying systems from what they _see_, and use what they see to bash MS products.

I heard that it was perhaps recently fixed, but copying many small files was multiple times faster via something like Total Commander than via the built-in File Explorer (large files go equally fast).

People seeing how slow Explorer was to copy would probably presume that it was a lower level Windows issue if they had a predisposed bias against Microsoft/Windows.

My theory about Explorer's sluggishness is that they added visual feedback to the copying process at some point, and for whatever reason that visual feedback is synchronous/slow (perhaps capped at the framerate, thus 60 files a second), whilst TC does its updating in the background and just renders status periodically, so the copying thread(s) can run at the full speed the OS is capable of under the hood.
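To make the theory concrete, here is a minimal Python sketch of the two designs. It is not how Explorer or Total Commander are actually implemented; the function names, the `copy_one` and `report` callbacks, and the 1/60 s interval are all assumptions made up for illustration. The first function blocks on a "frame" after every file, the second lets a worker thread copy at full speed while the caller only samples a counter periodically.

```python
import threading
import time


def copy_with_synchronous_progress(items, copy_one, report, interval=1 / 60):
    """Hypothetical 'slow' design: repaint after every file and wait a frame."""
    done = 0
    for item in items:
        copy_one(item)
        done += 1
        report(done)
        # Stand-in for blocking until the progress UI has repainted.
        time.sleep(interval)
    return done


def copy_with_background_progress(items, copy_one, report, interval=1 / 60):
    """Hypothetical 'fast' design: the worker copies, the caller samples progress."""
    done = 0
    finished = threading.Event()

    def worker():
        nonlocal done
        for item in items:
            copy_one(item)
            done += 1
        finished.set()

    thread = threading.Thread(target=worker)
    thread.start()
    # Wake up once per interval to report; the copy loop never waits on this.
    while not finished.wait(interval):
        report(done)
    thread.join()
    report(done)  # final update so the caller always sees the finished count
    return done
```

With the synchronous version, throughput cannot exceed roughly one file per interval no matter how fast `copy_one` is; with the background version, the number of progress reports depends on elapsed time rather than on the number of files.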

phantasmish an hour ago | parent | next [-]

I dunno about Windows Explorer, but macOS’ Finder seems to hash completed transfers over SMB (this must be something it can trigger the receiver to do in SMB itself; it doesn’t seem slow enough for the sender to be doing it on a remote file) and remove transferred files that don’t pass the check.

I could see that or other safety checks making one program slower than another that doesn’t bother. Or that sort of thing being an opportunity for a poor implementation that slows everything down a bunch.
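For a sense of what such a check costs, here is a rough Python sketch of a generic copy-then-verify step. It is not Finder's actual mechanism and does not use SMB at all; `sha256_of` and `copy_and_verify` are hypothetical helpers, and the choice of SHA-256 is an assumption. Note that it reads both files in full after the copy, which is exactly the kind of extra work that could make one tool slower than another.

```python
import hashlib
import os
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst):
    """Copy src to dst, then remove dst if its hash does not match src."""
    shutil.copyfile(src, dst)
    if sha256_of(src) != sha256_of(dst):
        os.remove(dst)  # discard the transfer that failed the check
        return False
    return True
```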

p_l 3 hours ago | parent | prev [-]

A problem with Explorer, one that it shares with macOS Finder[1], is that both are very much legacy applications with features piled on top. Explorer was never expected to be used for heavy I/O work and tends to do things the slowest way possible, including in ways that are optimized for "random first-time user of Windows 95 who will have maybe 50 files in a folder".

[1] Finder has parts that show continued use of code written for Mac OS 9 :V

whizzter 3 hours ago | parent | prev | next [-]

Plenty of other workloads benefit from high-performance file access. With network speeds and disk speeds getting higher whilst single-core perf has more or less plateaued in comparison, it's more and more important to support data paths where kernel switching won't become a bottleneck.

ycombinatrix 8 hours ago | parent | prev [-]

Dunno but I have used samba to load model weights from my NAS