saagarjha 2 days ago
I'm curious what this offers over just building the host-side code to be native?
jsomedon 2 days ago
My quick guess is that this approach offers near-zero overhead for the GPU to access data inside the sandbox, with all the security/privacy benefits of sandboxing.
agambrahma a day ago
Yes, for plain local inference it doesn't offer much -- native is the obvious choice. The value would be in actor processes, where you can delegate inference without paying the "copy tax" for crossing the sandbox boundary. So, less "inference engine" and more "tmux for AI agents": think pausing, moving, resuming, swapping the model backend. I scoped the post to memory architecture, since it was the least obvious part ... will follow up with one about the actor-model aspect.
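The "tmux for AI agents" idea above can be sketched in a few lines. Everything here is hypothetical illustration, not the author's actual design: `InferenceActor`, `send`, `pause`, `resume`, and `swap_backend` are invented names, and the backend strings are placeholders. The point is only that the actor's conversation state stays in place while the backend is swapped, so no "copy tax" is paid for the state itself.

```python
from dataclasses import dataclass, field

@dataclass
class InferenceActor:
    """Hypothetical actor owning one inference session."""
    backend: str                 # placeholder name, e.g. "backend-a"
    history: list = field(default_factory=list)
    paused: bool = False

    def send(self, prompt: str) -> str:
        if self.paused:
            raise RuntimeError("actor is paused")
        # Real inference would run here; we just record the turn.
        self.history.append(prompt)
        return f"[{self.backend}] reply to: {prompt}"

    def pause(self) -> None:
        self.paused = True

    def resume(self) -> None:
        self.paused = False

    def swap_backend(self, new_backend: str) -> None:
        # State (history) is untouched; only the backend changes,
        # which is the "no copy tax" property being claimed.
        self.backend = new_backend

actor = InferenceActor(backend="backend-a")
actor.send("hello")
actor.pause()            # pausable...
actor.resume()           # ...and resumable
actor.swap_backend("backend-b")
print(actor.send("continue"))  # history survived the backend swap
```

A real implementation would of course hold GPU-resident state rather than a Python list, but the actor interface (pause/resume/swap without copying) is the part the comment is describing.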
swiftcoder 2 days ago
For one thing, it's a lot easier to distribute a webpage than a native app.