Is Modal running every single service inside gVisor?

I have heard that gVisor isn't recommended for every production workload, only for front-facing services or some other specific activities, because it has some serious performance degradation, which is why most end up using Firecracker.

This is really cool though. Does this mean that we could probably have AI models that are snapshotted?

Are the checkpoint/recovery states encrypted by default, or how would that even work? What are the privacy aspects of it? I don't think even something like Modal would be the private LLM that people on subreddits like LocalLLaMA sometimes want when they don't have a GPU. Of course nothing beats the privacy of having your own GPUs, but I'd be curious to know what people's thoughts are.

markasoftware 20 hours ago | parent [-]

The thing is, Modal is running untrusted containers, so there's not really a concept of "some front facing" containers. Any container running an untrusted workload is at high risk, i.e. it is "front facing".

If Modal's customers' workloads are mainly GPU-bound, then the performance hit of gVisor isn't as big as it might be for other workloads. GPU activity does have to go through the fairly heavyweight nvproxy to be executed on the host, but most GPU activity consists of longer-lived async calls like running kernels, so a bit of overhead in starting those calls and retrieving their results can be tolerated.
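
To put rough numbers on that, here is a back-of-the-envelope sketch. The 0.05 ms per-call cost and the call durations are made-up figures for illustration, not measurements of gVisor or nvproxy, and `overhead_fraction` is just a hypothetical helper name.

```python
def overhead_fraction(call_ms: float, proxy_overhead_ms: float) -> float:
    """Share of total wall time spent on the fixed per-call sandbox overhead."""
    return proxy_overhead_ms / (call_ms + proxy_overhead_ms)


# Hypothetical fixed cost added to every proxied call (assumption, not measured).
PROXY_OVERHEAD_MS = 0.05

# Compare a very short call, a medium call, and a long-running GPU kernel.
for call_ms in (0.01, 1.0, 100.0):
    frac = overhead_fraction(call_ms, PROXY_OVERHEAD_MS)
    print(f"{call_ms:>7.2f} ms call: {frac:.2%} of time is overhead")
```

The same fixed cost dominates a very short call but nearly vanishes next to a long-running kernel, which is the amortization argument above.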

Imustaskforhelp 19 hours ago | parent [-]

Well, if someone is gonna use Modal exactly for GPU purposes then I guess it's okay, but anything compute-related just feels like it would have some issues performance-wise.

So I can agree that Modal might make sense for LLMs, but they position themselves as a sandbox, including for things like running Python code, and some of these workflows may be more compute-intensive than others, so I just wanted to point it out.

Fly.io uses Firecracker, so I kinda like Firecracker-related applications (I tried to run Firecracker myself; it's way too hard to build your own Firecracker-based provider or anything), and they recently released https://sprites.dev/

E2B is another well-known solution out there. I talked to their developers once and they mentioned that they run it on top of GCP.

I am really interested in Kata Containers as well, because I think Kata runs on top of Firecracker and can hook into Docker rather quickly.

amitprasad 12 hours ago | parent | next [-]

If you're not looking for GPU snapshotting the ecosystem is relatively mature. Specifically, CPU-only VM-based snapshotting techniques are pretty well understood. However, if you need GPUs, this is a notoriously hard problem. IIRC Fly also was planning on using gVisor (EDIT: cloud-hypervisor) for their GPU cloud, but abandoned the effort [1].

Kata runs atop many things, but is a little awkward because it creates a "pod" (VM) inside which it creates 1+ containers (runc/gVisor). Firecracker is also awkward because GPU support is pretty hard / impossible.

[1] https://fly.io/blog/wrong-about-gpu/

Imustaskforhelp 2 hours ago | parent [-]

Ohh, this makes sense now. Firecracker is good for compute-related workflows, but gVisor is better for GPU-related workflows, gotcha. For my use cases it's usually Firecracker, but I can now see why a company like Modal would use gVisor, because they focus a lot (and I mean a lot) on providing GPU access. I think it's one of their largest selling points. For them, compute customers are secondary, and gVisor's compute performance hit is a worthwhile trade-off. Thanks for trying to explain the situation!
