prngl 4 days ago

This is cool.

There's an interesting parallel with ML compilation libraries (TensorFlow 1, JAX jit, torch.compile) where a tracing approach is taken to build up a graph of operations that is then essentially compiled (or otherwise lowered and executed by a specialized VM). We're often nowadays working in dynamic languages, so they become essentially the frontend to new DSLs, and instead of defining new syntax, we embed the AST construction into the scripting language.

For ML, we're delaying the execution of GPU/linalg kernels so that we can fuse them. For RPC, we're delaying the execution of network requests so that we can fuse them.
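To make the tracing idea concrete, here's a minimal sketch in plain Python. It is not how any of the libraries above are implemented; `Node`, `lift`, `trace`, and `run` are made-up names for illustration. The function is run once on placeholder inputs, the operators record what would have happened, and a separate step executes the captured graph. A real backend would fuse or batch at that point instead of just walking the tree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One recorded operation: an op name plus its input nodes or constants."""
    op: str
    args: tuple = ()

    # Operators record work instead of doing it.
    def __add__(self, other):
        return Node("add", (self, lift(other)))

    def __mul__(self, other):
        return Node("mul", (self, lift(other)))


def lift(value):
    """Wrap plain Python numbers as constant nodes."""
    return value if isinstance(value, Node) else Node("const", (value,))


def trace(fn, *names):
    """Run fn once on placeholder inputs to capture its operation graph."""
    return fn(*(Node("input", (name,)) for name in names))


def run(node, env):
    """A trivial 'backend': walk the captured graph with real values."""
    if node.op == "input":
        return env[node.args[0]]
    if node.op == "const":
        return node.args[0]
    left, right = (run(arg, env) for arg in node.args)
    return left + right if node.op == "add" else left * right


def scale_and_shift(x, y):
    return x * 2 + y


graph = trace(scale_and_shift, "x", "y")   # nothing computed yet, just a tree
result = run(graph, {"x": 3, "y": 4})      # execution happens here
```

The point is only that `scale_and_shift` looks like ordinary code, while what it actually returns is a description of the computation.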

Of course, compiled languages themselves delay the execution of ops (add/mul/load/store/etc) so that we can fuse them, i.e. skip over the round-trip of the interpreter/VM loop.

The power of code as data in various guises.

Another angle on this is the importance of separating control plane (i.e. instructions) from data plane in distributed systems, which is any system where you can observe a "delay". When you zoom into a single CPU, it acknowledges its nature as a distributed system with memory far away by separating out the instruction pipeline and instruction cache from the data. In Cap'n Web, we've got the instructions as the RPC graph being built up.

I just thought these were some interesting patterns. I'm not sure I yet see all the way down to the bottom though. Feels like we go in circles, or rather, the stack is replicated (compiler built on interpreter built on compiler built on interpreter ...). In some respect this is the typical Lispy code is data, data is code, but I dunno, feels like there's something here to cut through...

ryanrasti 4 days ago

Agree -- I think that's a powerful generalization you're making.

> We're often nowadays working in dynamic languages, so they become essentially the frontend to new DSLs, and instead of defining new syntax, we embed the AST construction into the scripting language.

And I'd say that TypeScript is the real game-changer here. You get the flexibility of the JavaScript runtime (e.g., how Cap'n Web cleverly uses `Proxy`s) while still being able to provide static types for the embedded DSL you're creating. It’s the best of both worlds.

I've been spending all of my time in the ORM-analog here. Most ORMs are severely lacking on composability because they're fundamentally imperative and eager. A call like `db.orders.findAll()` executes immediately and you're stuck without a way to add operations before it hits the database.

A truly composable ORM should act like the compilers you mentioned: use TypeScript to define a fully typed DSL over the entirety of SQL, build an AST from the query, and then only at the end compile the graph into the final SQL query. That's the core idea I'm working on with my project, Typegres.
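A rough sketch of the deferred style, in Python rather than TypeScript to keep it short. This is not the Typegres API; the `Query` class and its methods are hypothetical, and the conditions are raw strings purely for brevity (a real builder would use typed expressions and bound parameters). Each call returns a new description, and SQL is only produced when you ask for it.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Query:
    """An immutable description of a SELECT; nothing here touches a database."""
    table: str
    filters: tuple = ()
    limit_count: Optional[int] = None

    def where(self, condition: str) -> "Query":
        # Return a new Query so earlier ones stay reusable.
        return replace(self, filters=self.filters + (condition,))

    def limit(self, count: int) -> "Query":
        return replace(self, limit_count=count)

    def to_sql(self) -> str:
        """Compile the accumulated description into one SQL string."""
        sql = f"SELECT * FROM {self.table}"
        if self.filters:
            sql += " WHERE " + " AND ".join(self.filters)
        if self.limit_count is not None:
            sql += f" LIMIT {self.limit_count}"
        return sql


orders = Query("orders")
recent_paid = orders.where("status = 'paid'").where("total > 100").limit(10)
sql = recent_paid.to_sql()
```

Because `orders` is never mutated, it can be extended in several directions, which is the composability an eager `findAll()` gives up.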

If you find the pattern interesting: https://typegres.com/play/

prngl 4 days ago

I do find the pattern interesting and powerful.

But at the same time, something feels off about it (just conceptually, not trying to knock your money-making endeavor, godspeed). Some of the issues that all of these hit are:

- No printf debugging. Sometimes you want things to be eager so you can immediately see what's happening. If you print and what you see is <RPCResultTracingObject> that's not very helpful. But that's what you'll get when you're in a "tracing" context, i.e. you're treating the code as data at that point, so you just see the code as data. One way of getting around this is to make the tracing completely lazy, so no tracing context at all, but instead you just chain as you go, and something like `print(thing)` or `thing.execute()` actually then ships everything off. This seems like how much of Cap'n Web works except for the part where they embed the DSL, and then you're in a fundamentally different context.

- No "natural" control flow in the DSL/tracing context. You have to use special if/while/for/etc so that the object/context "sees" them. Though that's only the case if the control flow is data-dependent; if it's based on config values that's fine, as long as the context builder is aware.

- No side effects in the DSL/tracing context because that's not a real "running" context, it's only run once to build the AST and then never run again.
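All three issues show up even in a toy tracer. The sketch below is plain Python with made-up names (`Traced`, `cond`, `clamp_double`); it doesn't mirror any particular library's API. Printing gives you the placeholder, a data-dependent `if` has nothing to branch on, and the side effect fires once per trace rather than once per run.

```python
class Traced:
    """Placeholder standing in for a value that does not exist yet."""

    def __init__(self, expr):
        self.expr = expr

    def __gt__(self, other):
        return Traced(f"({self.expr} > {other!r})")

    def __mul__(self, other):
        return Traced(f"({self.expr} * {other!r})")

    def __repr__(self):
        return f"<Traced {self.expr}>"

    def __bool__(self):
        # Refusing to guess forces data-dependent branches through cond().
        raise TypeError("value is not known while tracing; use cond()")


def cond(predicate, if_true, if_false):
    """Record both branches so the backend can choose at run time."""
    return Traced(f"cond({predicate.expr}, {if_true.expr}, {if_false.expr})")


log = []


def clamp_double(x, verbose=False):
    log.append("traced")           # side effect: happens per trace, not per run
    if verbose:                    # config-dependent: an ordinary `if` is fine
        log.append(repr(x))        # shows the placeholder, not a number
    return cond(x > 10, x, x * 2)  # data-dependent: must be recorded


graph = clamp_double(Traced("x"), verbose=True)

# A natural `if` on a traced value has no concrete answer to branch on.
try:
    if Traced("x") > 10:
        pass
    natural_if_error = None
except TypeError as exc:
    natural_if_error = str(exc)
```

Raising in `__bool__` is one design choice; the alternative is to silently take one branch and bake it into the graph, which tends to be the more confusing failure.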

Of the various flavors of this I've seen, it's the ML usage I think that's pushed it the furthest out of necessity (for example, jax.jit https://docs.jax.dev/en/latest/_autosummary/jax.jit.html, note the "static*" arguments).

Is this all just necessary complexity? Or is it because we're missing something, not quite seeing it right?

porridgeraisin 4 days ago

I think this kind of tracing-caused complexity only arises when the language doesn't let you easily represent and manipulate code as data, or when the language doesn't have static type information.

Python does let you mess around with the AST. However, there is no static typing, and let's just say that the ML ecosystem will <witty example of extreme act> before they adopt static typing. So it's not possible to build these graphs without doing this kind of hacky nonsense.

For another example, torch.compile() works at the Python bytecode level. It basically hooks CPython's frame evaluation function (the PEP 523 eval-frame API) for all torch.compile-decorated functions. Inside that, it checks for any operators, e.g. BINARY_MULTIPLY (BINARY_OP on Python 3.11+), involving torch tensors and records them. Any `if` conditions in the path get translated to guards in the resulting graph. Later, when a guard fails, it recomputes the subgraph with the complementary condition (and any additional conditions), stores this as an alternative JIT path, and in the future muxes between the paths depending on the guards now in place.
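The guard-and-respecialize idea can be shown without any bytecode tricks. To be clear, this toy is nothing like the real torch.compile machinery: `Probe`, `replay`, and `compile_with_guards` are invented for illustration, and it only handles a single numeric input. It runs the function on a real value, records the straight-line path taken along with the branch outcomes as guards, and re-traces when a later input fails those guards.

```python
class GuardFailed(Exception):
    pass


class Probe:
    """Wraps a concrete number and records ops and branch decisions."""

    def __init__(self, value, program):
        self.value = value
        self.program = program

    def __gt__(self, threshold):
        outcome = self.value > threshold
        self.program.append(("guard_gt", threshold, outcome))  # branch becomes a guard
        return outcome

    def __mul__(self, factor):
        self.program.append(("mul", factor, None))
        return Probe(self.value * factor, self.program)

    def __add__(self, amount):
        self.program.append(("add", amount, None))
        return Probe(self.value + amount, self.program)


def replay(program, value):
    """Run a recorded straight-line path, bailing out if a guard no longer holds."""
    for op, operand, expected in program:
        if op == "guard_gt":
            if (value > operand) != expected:
                raise GuardFailed
        elif op == "mul":
            value *= operand
        else:
            value += operand
    return value


def compile_with_guards(fn):
    paths = []  # one straight-line program per branch combination seen so far

    def wrapper(value):
        for program in paths:
            try:
                return replay(program, value)
            except GuardFailed:
                continue
        program = []                       # no cached path fits: trace again
        result = fn(Probe(value, program))
        paths.append(program)
        return result.value

    wrapper.paths = paths
    return wrapper


@compile_with_guards
def step(x):
    if x > 10:
        return x * 2
    return x + 1


results = [step(20), step(30), step(3), step(5)]
```

After these calls there are two cached paths, one per branch. The natural `if` keeps working because tracing always has a concrete value in hand; the cost is re-tracing on every new branch combination.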

JAX works by making the function arguments proxies and recording the operations, like you mentioned. However, you cannot use a normal `if` on traced values; you use lax.cond(), lax.while_loop(), etc. As a result, it doesn't recompute the graph when different branches are encountered; it only computes the graph once.

In a language such as C#, Rust, or a statically typed Lisp, you wouldn't need to do any of this monkey business. There's probably already a way in the Rust toolchain to hook in at the MIR stage and have your own backend convert these to some Tensor IR.

prngl 4 days ago

Yes, being able to have compilers as libraries inline in the same code and same language. That feels like what all these call for. Which really is the Lisp core I suppose. But with static types and heterogeneous backends. MLIR I think hoped (hopes?) to be something like this but while C++ may be pragmatic it’s not elegant.

Maybe totally off but would dependent types be needed here? The runtime value of one “language” dictates the code of another. So you have some runtime compilation. Seems like dependent types may be the language of jit-compiled code.

Anyways, heady thoughts spurred by a most pragmatic of libraries. Cloudflare wants to sell more schlock to the javascripters and we continue our descent into madness. Einsteins building AI connected SaaS refrigerators. And yet there is beauty still within.

ryanrasti 3 days ago

Really nice summary of the core challenges with this DSL/code-as-data pattern.

I've spent a lot of time thinking about this in the database context:

> No printf debugging

Yeah, spot on. The solution here would be something like a `toSQL` that lets you inspect the compiled output at any step in the AST construction.

Also, if the backend supports it, you could compile a `printf` function all the way to the backend (this isn't supported in SQL though)

> No "natural" control flow in the DSL/tracing context

Agreed -- that can be a source of confusion and subtle bugs.

You could have a build rule that actually compiles `if`/`while`/`for` into your AST (instead of evaluating them in the frontend DSL). Or you could have custom lint rules to forbid them in the DSL.

At the same time -- part of what makes query builders so powerful is the ability to dynamically construct queries. Runtime conditionals are what make that possible.
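For instance, here's a small Python sketch where ordinary frontend `if`s decide the shape of the query. Everything in it is hypothetical: the `build_order_query` function, the `orders` table and its columns, and the `%s` placeholder style (which assumes a driver that uses that convention).

```python
def build_order_query(status=None, min_total=None, newest_first=False):
    """Assemble SQL plus bound parameters from whichever filters were supplied."""
    clauses, params = [], []

    # These branches run in the host language, at query-construction time.
    if status is not None:
        clauses.append("status = %s")
        params.append(status)
    if min_total is not None:
        clauses.append("total >= %s")
        params.append(min_total)

    sql = "SELECT * FROM orders"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    if newest_first:
        sql += " ORDER BY created_at DESC"
    return sql, params


everything = build_order_query()
big_paid = build_order_query(status="paid", min_total=100, newest_first=True)
```

Here the conditionals depend only on arguments known when the query is built, which is the case where natural control flow is a feature rather than a trap.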

> No side effects in the DSL/tracing context because that's not a real "running" context

Agreed -- similar to the above: this is something that needs to be forbidden (e.g., by a lint rule) or clearly understood before using it.

> Is this all just necessary complexity? Or is it because we're missing something, not quite seeing it right?

My take is that, at least in the SQL case: 100% the complexity is justified.

Big reasons why:

1. A *huge* impediment to productive engineering is context switching. A DSL in the same language as your app (i.e., an ORM) makes the bridge to your application code seamless. (This is similar to the argument for having your entire stack be a single language.)

2. The additional layer of indirection (building an AST) allows you to dynamically construct expressions in a way that isn't possible in SQL. This is effectively adding a (very useful) macro system on top of SQL.

3. In the case of TypeScript, because its type system is so flexible, you can have stronger typing on your DSL than the backend target.

tl;dr is these DSLs can enable better ergonomics in practice and the indirection can unlock powerful new primitives

ignoramous 4 days ago

> I just thought these were some interesting patterns.

Reading this from TFA ...

  Alice and Bob each maintain some state about the connection. In particular, each maintains an "export table", describing all the pass-by-reference objects they have exposed to the other side, and an "import table", describing the references they have received. 

  Alice's exports correspond to Bob's imports, and vice versa. Each entry in the export table has a signed integer ID, which is used to reference it. You can think of these IDs like file descriptors in a POSIX system. Unlike file descriptors, though, IDs can be negative, and an ID is never reused over the lifetime of a connection.

  At the start of the connection, Alice and Bob each populate their export tables with a single entry, numbered zero, representing their "main" interfaces.

  Typically, when one side is acting as the "server", they will export their main public RPC interface as ID zero, whereas the "client" will export an empty interface. However, this is up to the application: either side can export whatever they want.

... sounds very similar to how Binder IPC (and soon RPC) works on Android.
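The table bookkeeping in that quote can be sketched in a few lines of Python. This is only an illustration of "ID zero is the main interface" and "IDs are never reused"; `ExportTable` and `Greeter` are made-up names, and the sketch assumes IDs are handed out upward from 1, ignoring the negative IDs and the wire format that the real protocol has.

```python
import itertools


class ExportTable:
    """One side's table of objects it has exposed to its peer."""

    def __init__(self, main_interface):
        self._entries = {0: main_interface}  # ID zero is the "main" interface
        self._next_id = itertools.count(1)   # assumption: allocate upward only

    def export(self, obj):
        export_id = next(self._next_id)      # counter only moves forward, so no reuse
        self._entries[export_id] = obj
        return export_id

    def resolve(self, export_id):
        return self._entries[export_id]

    def release(self, export_id):
        # Dropping an entry does not free its number for later use.
        del self._entries[export_id]


class Greeter:
    def hello(self, name):
        return f"Hello, {name}!"


server_exports = ExportTable(Greeter())

callback_id = server_exports.export(print)
server_exports.release(callback_id)
next_id = server_exports.export(len)      # gets a fresh ID, not the released one

greeting = server_exports.resolve(0).hello("Bob")
```

The file-descriptor comparison holds for lookup, but unlike POSIX descriptors a released number is never handed out again on the same connection.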