fabian2k 5 hours ago

> This difference is particularly noticeable with multiple images sharing the same base layers. With legacy storage drivers, shared base layers were stored once locally and reused across images that depended on them. With containerd, each image stores its own compressed version of shared layers, even though the uncompressed layers are still de-duplicated through snapshotters.

This seems like a really weird decision. If base images are duplicated for every image you have, that will add up quickly.
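
Out of curiosity, here's a rough way to see how much sharing is actually at stake on a given machine: count how many local images reference each layer diff ID. This is only a sketch and only an approximation; docker inspect reports uncompressed diff IDs, not the compressed blobs containerd would duplicate, and it assumes the docker CLI is on PATH:

    # Count how many local images reference each (uncompressed) layer diff ID.
    import json
    import subprocess
    from collections import Counter

    def image_layers(ref: str) -> list[str]:
        out = subprocess.check_output(["docker", "image", "inspect", ref])
        return json.loads(out)[0]["RootFS"]["Layers"]

    refs = subprocess.check_output(
        ["docker", "image", "ls", "--format", "{{.Repository}}:{{.Tag}}"]
    ).decode().split()

    layer_counts = Counter()
    for ref in refs:
        if "<none>" in ref:
            continue  # skip dangling images without a usable tag
        for layer in set(image_layers(ref)):
            layer_counts[layer] += 1

    shared = sum(1 for n in layer_counts.values() if n > 1)
    print(f"{shared} of {len(layer_counts)} layers are referenced by 2+ images")

Every layer counted there more than once is a layer that per-image compressed storage would store more than once.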

kodama-lens 4 hours ago | parent | next [-]

I think there is an issue/PR open right now to change this. See: https://github.com/containerd/containerd/issues/13307

epistasis 3 hours ago | parent [-]

Oh, I'm very glad to see this; the ML applications mentioned in it are exactly why I thought this was such a disastrous change.

However, the tedium of the reply chain reminds me why I tend to focus most of my energy on internal projects rather than external open source...

Docker may have been built for a specific type of use case that most developers are familiar with (e.g. web apps backed by a DB container), but containerization is useful across parts of computing that look very different. Something that seems trivial in the Python/DB space, having one or two small duplicates of OS layers, is very different once you have 30 containers for different models+code, plus ~100 more dev containers lying around as artifacts from building, pushing, and pulling, each at ~10GB. At that scale the inefficient new system is just painful.
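
To put rough numbers on that (the figures below are illustrative assumptions, not measurements), here's the back-of-envelope difference between storing a shared base once and storing it per image:

    # Back-of-envelope cost of duplicating a shared base across local images.
    # All values are illustrative assumptions from the scenario above.
    images = 30 + 100        # model/code containers + leftover dev/build images
    base_gb = 10             # compressed CUDA/PyTorch base shared by all of them
    unique_gb = 0.5          # assumed per-image code/config layers on top

    shared = base_gb + images * unique_gb        # legacy: base stored once -> 75 GB
    duplicated = images * (base_gb + unique_gb)  # per-image copies -> 1365 GB

    print(f"shared base: {shared:.0f} GB, duplicated base: {duplicated:.0f} GB")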

The smallest PyTorch container I ever built was 1.8GB, and that was just for some CPU-only inference endpoints; it took several hours of yak shaving to achieve, and after a month or two of development it had ballooned back to 8GB. Containers with CUDA, or with other significant AI/ML libraries, get really big. YAGNI is a great principle for your own code when writing from scratch, but YAGNI is dangerous when an entire ecosystem has been built on your product and things are getting rewritten from scratch, because the "you" is far larger than the developer making the change. Docker's core feature has always been reusable and composable layers, so seeing it abandoned suggests somebody took YAGNI far too far in their own corner of the computing world.
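
In case it helps anyone else slimming a PyTorch image: PyTorch publishes CPU-only wheels, and a quick way to find the remaining bloat is to rank installed packages by on-disk size. A minimal sketch of the latter, run inside the container's Python environment:

    # Rank installed packages by on-disk size to see what is bloating the image.
    import os
    from importlib.metadata import distributions

    def dist_size(dist) -> int:
        # Sum the sizes of the files this distribution installed.
        total = 0
        for f in dist.files or []:
            try:
                total += os.path.getsize(dist.locate_file(f))
            except OSError:
                pass  # file may have been removed after install
        return total

    sizes = sorted(((d.metadata["Name"], dist_size(d)) for d in distributions()),
                   key=lambda item: item[1], reverse=True)
    for name, size in sizes[:10]:
        print(f"{size / 2**20:8.1f} MiB  {name}")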

epistasis 5 hours ago | parent | prev | next [-]

This is hell for a lot of ML containers, which have gigabytes of CUDA and PyTorch. Before, you could at least keep your code confined to its own layer. But if I understand this correctly, every code revision now duplicates gigabytes of the same damn bloated crap.
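
You can check how well your layer ordering is working by comparing the layer digests of two revisions of the same image. A sketch against the docker CLI (the tag names are hypothetical placeholders):

    # Compare layer digests between two tags to see how many layers a code
    # change actually rewrote. "myapp:rev1"/"myapp:rev2" are placeholder tags.
    import json
    import subprocess

    def layers(ref: str) -> list[str]:
        out = subprocess.check_output(["docker", "image", "inspect", ref])
        return json.loads(out)[0]["RootFS"]["Layers"]

    old, new = layers("myapp:rev1"), layers("myapp:rev2")
    common = 0
    for a, b in zip(old, new):
        if a != b:
            break
        common += 1
    print(f"{common}/{len(new)} layers unchanged, {len(new) - common} rewritten")

With the legacy drivers those unchanged layers cost nothing extra; the complaint here is that, if the quoted behaviour is right, containerd keeps a separate compressed copy of them per image anyway.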

spwa4 4 hours ago | parent [-]

If you have problems with 13 (I believe) GB of Docker layers... how do you deal with terabytes or petabytes of AI training data?

epistasis 4 hours ago | parent | next [-]

Training on petabytes of data is only one application of PyTorch, and one that's going to use tens of thousands of containers, but...

Inference, development cycles, any of the application domains of PyTorch that don't involve training frontier models... all of those are complicated by this excessive duplication of container layers.

But mostly, dev really sucks when a small code change means writing out an extra 10GB.

StableAlkyne 4 hours ago | parent | prev | next [-]

You don't even need megabytes of training data for some ML applications. AI is the sexy thing nowadays, but neural networks (Torch is a NN library) are generally useful even for small regression and classification problems.

For some problems you can even get away with a single-digit number of training points (a classic example of this regime being Physics-Informed Neural Networks).
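
For the curious, here's an illustrative PINN in PyTorch that fits the ODE u'(t) = -u(t) with u(0) = 1 from just eight collocation points; the network size and hyperparameters are arbitrary choices for the sketch:

    # Toy physics-informed network: learn u with u' = -u and u(0) = 1,
    # supervised only by the equation residual at 8 collocation points.
    import torch

    torch.manual_seed(0)
    net = torch.nn.Sequential(
        torch.nn.Linear(1, 32), torch.nn.Tanh(),
        torch.nn.Linear(32, 1),
    )
    t = torch.linspace(0.0, 2.0, 8).reshape(-1, 1).requires_grad_(True)
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)

    for step in range(2000):
        u = net(t)
        du = torch.autograd.grad(u.sum(), t, create_graph=True)[0]  # du/dt
        residual = du + u                      # should be 0 if u' = -u
        ic = net(torch.zeros(1, 1)) - 1.0      # initial condition u(0) = 1
        loss = residual.pow(2).mean() + ic.pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Compare against the exact solution u(t) = exp(-t) at t = 1.
    print(net(torch.tensor([[1.0]])).item(), "vs", torch.exp(torch.tensor(-1.0)).item())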

Normal_gaussian 4 hours ago | parent | prev | next [-]

The training data is on a separate drive; or the training data isn't that large for this use case; or they aren't training at all.

0cf8612b2e1e 2 hours ago | parent | prev [-]

You don’t train on petabytes on your laptop.

IsTom 5 hours ago | parent | prev | next [-]

Docker already hogs a lot of disk space and needs to be pruned regularly. I can't imagine what it's going to be like now.
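
For reference, `docker system df` shows what's being used and `docker system prune` reclaims it. A minimal sketch of a periodic cleanup, assuming the docker CLI is on PATH:

    # Report Docker disk usage, then remove unused data. Plain `prune -f`
    # only removes dangling images; add -a to drop all unused images.
    import subprocess

    print(subprocess.check_output(["docker", "system", "df"]).decode())
    subprocess.run(["docker", "system", "prune", "-f"], check=True)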

embedding-shape an hour ago | parent | prev [-]

"really weird decision" seems like an understatement, I thought the entire point of the specific storage design with the whole layering shebang was so things could be shared? If you remove that, just get rid of layers as a whole, what's the point otherwise?