| ▲ | Zig's new plan for asynchronous programs(lwn.net) |
| 162 points by messe 8 hours ago | 120 comments |
| |
|
| ▲ | AndyKelley 5 hours ago | parent | next [-] |
| Overall this article is accurate and well-researched. Thanks to Daroc Alden for due diligence. Here are a couple of minor corrections: > When using an Io.Threaded instance, the async() function doesn't actually do anything asynchronously — it just runs the provided function right away. While this is a legal implementation strategy, this is not what std.Io.Threaded does. By default, it will use a configurably sized thread pool to dispatch async tasks. It can, however, be statically initialized with init_single_threaded in which case it does have the behavior described in the article. The only other issue I spotted is: > For that use case, the Io interface provides a separate function, asyncConcurrent() that explicitly asks for the provided function to be run in parallel. There was a brief moment where we had asyncConcurrent() but it has since been renamed more simply to concurrent(). |
| |
| ▲ | landr0id 2 hours ago | parent [-] | | Hey Andrew, question for you about something the article lightly touches on but doesn't really discuss further: > If the programmer uses async() where they should have used asyncConcurrent(), that is a bug. Zig's new model does not (and cannot) prevent programmers from writing incorrect code, so there are still some subtleties to keep in mind when adapting existing Zig code to use the new interface. What class of bug occurs if the wrong function is called? Is it "UB" depending on the IO model provided, a logic issue, or something else? | | |
| ▲ | AndyKelley an hour ago | parent [-] | | A deadlock. For example, the function is called immediately, rather than being run in a separate thread, causing it to block forever on accept(), because the connect() is after the call to async(). If concurrent() is used instead, the I/O implementation will spawn a new thread for the function, so that the accept() is handled by the new thread, or it will return error.ConcurrencyUnavailable. async() is infallible. concurrent() is fallible. |
|
|
|
| ▲ | thefaux 3 hours ago | parent | prev | next [-] |
This design seems very similar to async in scala except that in scala the execution context is an implicit parameter rather than an explicit parameter. I did not find this api to be significantly better for many use cases than writing threads and communicating over a concurrent queue. There were significant downsides as well because the program behavior was highly dependent on the execution context. It led to spooky action at a distance problems where unrelated tasks could interfere with each other, and management of the execution context was a pain. My sense though is that the zig team has little experience with scala and thus does not realize the extent to which this is not a novel approach, nor is it a panacea. |
| |
| ▲ | pron 25 minutes ago | parent [-] | | > I did not find this api to be significantly better for many use cases than writing threads and communicating over a concurrent queue. The problem with using OS threads is that you run into scaling problems due to Little's law. On the JVM we can use virtual threads, which don't run into that limitation, but the JVM can implement user-mode threads more efficiently than low-level languages can for several reasons (the JIT can see through all virtual calls, the JVM has helpful restrictions on pointers into the stack, and good GCs make memory management very cheap in exchange for a higher RAM footprint). So if you want scalability, low-level languages need other solutions. |
|
|
| ▲ | woodruffw 6 hours ago | parent | prev | next [-] |
I think this design is very reasonable. However, I find Zig's explanation of it pretty confusing: they've taken pains to emphasize that it solves the function coloring problem, which it doesn't: it pushes I/O into an effect type, which essentially behaves as a token that callers need to retain. This is a form of coloring, albeit one that's much more ergonomic. (To my understanding this is pretty similar to how Go solves asynchronicity, except that in Go's case the "token" is managed by the runtime.) |
| |
| ▲ | flohofwoe 6 hours ago | parent | next [-] | | If calling the same function with a different argument would be considered 'function coloring', then every function in a program is 'colored' and the word loses its meaning ;) Zig actually also had solved the coloring problem in the old and abandoned async-await solution because the compiler simply stamped out a sync- or async-version of the same function based on the calling context (this works because everything is a single compilation unit). | | |
| ▲ | SkiFire13 3 hours ago | parent | next [-] | | > Zig actually also had solved the coloring problem in the old and abandoned async-await solution because the compiler simply stamped out a sync- or async-version of the same function based on the calling context (this works because everything is a single compilation unit). AFAIK this still leaked through function pointers, which were still sync or async (and this was not visible in their type) | |
| ▲ | woodruffw 6 hours ago | parent | prev | next [-] | | > If calling the same function with a different argument would be considered 'function coloring', then every function in a program is 'colored' and the word loses its meaning ;) Well, yes, but in this case the colors (= effects) are actually important. The implications of passing an effect through a system are nontrivial, which is why some languages choose to promote that effect to syntax (Rust) and others choose to make it a latent invariant (Java, with runtime exceptions). Zig chooses another path not unlike Haskell's IO. | |
| ▲ | adamwk 5 hours ago | parent | prev | next [-] | | The subject of the function coloring article was callback APIs in Node, so an argument you need to pass to your IO functions is very much in the spirit of colored functions and has the same limitations. | | |
| ▲ | jakelazaroff 5 hours ago | parent [-] | | In Zig's case you pass the argument whether or not it's asynchronous, though. The caller controls the behavior, not the function being called. | | |
| ▲ | layer8 4 hours ago | parent [-] | | The coloring is not the concrete argument (Io implementation) that is passed, but whether the function has an Io parameter in the first place. Whether the implementation of a function performs IO is in principle an implementation detail that can change in the future. A function that doesn't take an Io argument but wants to call another function that requires an Io argument can't. So you end up adding Io parameters just in case, and in turn require all callers to do the same. This is very much like function coloring. In a language with objects or closures (which Zig doesn't have first-class support for), one flexibility benefit of the Io object approach is that you can move it to object/closure creation and keep the function/method signature free from it. Still, you have to pass it somewhere. | | |
| ▲ | messe 2 hours ago | parent | next [-] | | > Whether the implementation of a function performs IO is in principle an implementation detail that can change in the future. I think that's where your perspective differs from Zig developers. Performing IO, in my opinion, is categorically not an implementation detail. In the same way that heap allocation is not an implementation detail in idiomatic Zig. I don't want to find out my math library is caching results on disk, or allocating megabytes to memoize. I want to know what functions I can use in a freestanding environment, or somewhere resource constrained. | |
| ▲ | derriz 2 hours ago | parent | prev | next [-] | | > A function that doesn't take an Io argument but wants to call another function that requires an Io argument can't. Why? Can’t you just create an instance of an Io of whatever flavor you prefer and use that? Or keep one around for use repeatedly? The whole “hide a global event loop behind language syntax” is an example of a leaky abstraction which is also restrictive. The approach here is explicit and doesn’t bind functions to hidden global state. | | | |
| ▲ | quantummagic 3 hours ago | parent | prev [-] | | Is that a problem in practice though? Zig already has this same situation with its memory allocators; you can't allocate memory unless you take a parameter. Now you'll just have to take a memory allocator AND an additional io object. Doesn't sound very ergonomic to me, but if all Zig code conforms to this scheme, in practice there will only-one-way-to-do-it. So one of the colors will never be needed, or used. |
|
|
| |
| ▲ | jcranmer 5 hours ago | parent | prev | next [-] | | > If calling the same function with a different argument would be considered 'function coloring', then every function in a program is 'colored' and the word loses its meaning ;) I mean, the concept of "function coloring" in the first place is itself an artificial distinction invented to complain about the incongruent methods of dealing with "do I/O immediately" versus "tell me when the I/O is done"--two methods of I/O that are so very different that it really requires very different designs of your application on top of those I/O methods: in a sync I/O case, I'm going to design my parser to output a DOM because there's little benefit to not doing so; in an async I/O case, I'm instead going to have a streaming API. I'm still somewhat surprised that "function coloring" has become the default lens to understand the semantics of async, because it's a rather big misdirection from the fundamental tradeoffs of different implementation designs. | |
| ▲ | rowanG077 6 hours ago | parent | prev [-] | | If your functions suddenly requires (currently)unconstructable instance "Magic" which you now have to pass in from somewhere top level, that indeed suffers from the same issue as async/await. Aka function coloring. But most functions don't. They require some POD or float, string or whatever that can be easily and cheaply constructed in place. |
| |
| ▲ | eikenberry 25 minutes ago | parent | prev | next [-] | | Function coloring is specifically about requiring syntax for a function, eg. the async keyword. So if you want an async and non-async function you need to write both in code. If you pass the "coloring" as an argument you avoid the need for extra syntax and multiple function definitions and therefor the function has no color. You can solve this in various ways with various tradeoffs but as long as there is a single function (syntactically) is all that matters for coloring. | | |
| ▲ | IshKebab 8 minutes ago | parent [-] | | > Function coloring is specifically about requiring syntax for a function, eg. the async keyword. It isn't really. It's about having two classes of functions (async and sync), and not being able to await async functions from sync ones. It was originally about Javascript, where it is the case due to how the runtime works. In a sync function you can technically call an async one, but it returns a promise. There's no way to get the actual result before you return from your sync function. That isn't the case for all languages though. E.g. in Rust: https://docs.rs/futures/latest/futures/executor/fn.block_on.... I think maybe Python can do something similar but don't quote me on that. There's a closely related problem about making functions generic over synchronicity, which people try and solve with effects, monads, etc. Maybe people call that "function colouring" now, but that wasn't exactly the original meaning. |
| |
| ▲ | jayd16 6 hours ago | parent | prev | next [-] | | Actually it seems like they just colored everything async and you pick whether you have worker threads or not. I do wonder if there's more magic to it than that because it's not like that isn't trivially possible in other languages. The issue is it's actually a huge foot gun when you mix things like this. For example your code can run fine synchronously but will deadlock asynchronously because you don't account for methods running in parallel. Or said another way, some code is thread safe and some code isn't. Coloring actually helps with that. | | |
| ▲ | flohofwoe 6 hours ago | parent [-] | | > Actually it seems like they just colored everything async and you pick whether you have worker threads or not. There is no 'async' anywhere yet in the new Zig IO system (in the sense of the compiler doing the 'state machine code transform' on async functions). AFAIK the current IO runtimes simply use traditional threads or coroutines with stack switching. Bringing code-transform-async-await back is still on the todo-list. The basic idea is that the code which calls into IO interface doesn't need to know how the IO runtime implements concurrency. I guess though that the function that's called through the `.async()` wrapper is expected to work properly both in multi- and single-threaded contexts. | | |
| ▲ | jayd16 5 hours ago | parent [-] | | > There is no 'async' I meant this more as simply an analogy to the devX of other languages. >Bringing code-transform-async-await back is still on the todo-list. The article makes it seem like "the plan is set" so I do wonder what that Todo looks like. Is this simply the plan for async IO? > is expected to work properly both in multi- and single-threaded contexts. Yeah... about that.... I'm also interested in how that will be solved. RTFM? I suppose a convention could be that your public API must be thread safe and if you have a thread-unsafe pattern it must be private? Maybe something else is planned? | | |
|
| |
| ▲ | doyougnu 5 hours ago | parent | prev | next [-] | | Agreed. The Haskeller in me screams "You've just implemented the IO monad without language support". | |
| ▲ | AndyKelley 3 hours ago | parent [-] | | It's not a monad because it doesn't return a description of how to carry out I/O that is performed by a separate system; it does the I/O inside the function before returning. That's a regular old interface, not a monad. | | |
| |
| ▲ | SkiFire13 3 hours ago | parent | prev | next [-] | | The function coloring problem actually comes up when you implement the async part using stackless coroutines (e.g. in Rust) or callbacks (e.g. in Javascript). Zig's new I/O does neither of those for now, which is why it doesn't suffer from it; but at the same time it didn't "solve" the problem, it just sidestepped it by providing an implementation that has similar features but not exactly the same tradeoffs. | |
| ▲ | bloppe 3 hours ago | parent | next [-] | | How are the tradeoffs meaningfully different? Imagine that, instead of passing an `Io` object around, you just had to add an `async` keyword to the function, and that was simply syntactic sugar for an implied `Io` argument, and you could use an `await` keyword as syntactic sugar to pass whatever `Io` object the caller has to the callee. I don't see how that's not the exact same situation. | | |
| ▲ | bevr1337 2 hours ago | parent | next [-] | | In the JS example, a synchronous function cannot poll the result of a Promise. This is meaningfully different when implementing loops and streams. Ex, game loop, an animation frame, polling a stream. A great example is React Suspense. To suspend a component, the render function throws a Promise. To trigger a parent Error Boundary, the render function throws an error. To resume a component, the render function returns a result. React never made the suspense API public because it's a footgun. If a JS Promise were inspectable, a synchronous render function could poll its result, and suspended components would not need to use throw to try and extend the language. | | |
| ▲ | bloppe an hour ago | parent [-] | | I see. I guess JS is the only language with the coloring problem, then, which is strange because it's one of the few with a built-in event loop. This Io business is isomorphic to async/await in Rust or Python [1]. Go also has a built-in "event loop"-type thing, but decidedly does not have a coloring problem. I can't think of any languages besides JS that do. [1]: https://news.ycombinator.com/item?id=46126310 |
| |
| ▲ | VMG 2 hours ago | parent | prev [-] | | Maybe I have this wrong, but I believe the difference is that you can create an Io instance in a function that has none | | |
| ▲ | bloppe 2 hours ago | parent [-] | | In Rust, you can always create a new tokio runtime and use that to call an async function from a sync function. Ditto with Python: just create a new asyncio event loop and call `run`. That's actually exactly what an Io object in Zig is, but with a new name. Looking back at the original function coloring post [1], it says: > It is better. I will take async-await over bare callbacks or futures any day of the week. But we’re lying to ourselves if we think all of our troubles are gone. As soon as you start trying to write higher-order functions, or reuse code, you’re right back to realizing color is still there, bleeding all over your codebase. So if this is isomorphic to async/await, it does not "solve" the coloring problem as originally stated, but I'm starting to think it's not much of a problem at all. Some functions just have different signatures from other functions. It was only a huge problem for JavaScript because the ecosystem at large decided to change the type signatures of some giant portion of all functions at once, migrating from callbacks to async. [1]: https://journal.stuffwithstuff.com/2015/02/01/what-color-is-... |
|
| |
| ▲ | zamalek 2 hours ago | parent | prev [-] | | It's sans-io at the language level, I like the concept. So I did a bit of research into how this works in Zig under the hood, in terms of compilation. First things first, Zig does compile async fns to a state machine: https://github.com/ziglang/zig/issues/23446 The compiler decides at compile time which color to compile the function as (potentially both). That's a neat idea, but... https://github.com/ziglang/zig/issues/23367 > It would be checked illegal behavior to make an indirect call through a pointer to a restricted function type when the value of that pointer is not in the set of possible callees that were analyzed during compilation. That's... a pretty nasty trade-off. Object safety in Rust is really annoying for async, and this smells a lot like it. The main difference is that it's vaguely late-bound in a magical way; you might get an unexpected runtime error and - even worse - potentially not have the tools to force the compiler to add a fn to the set of callees. I still think sans-io at the language level might be the future, but this isn't a complete solution. Maybe we should be simply compiling all fns to state machines (with the Rust polling implementation detail, a sans-io interface could be used to make such functions trivially sync - just do the syscall and return a completed future). | | |
| ▲ | algesten 35 minutes ago | parent [-] | | I wouldn't define it as Sans-IO if you take an IO argument and block/wait on reading/writing, whether that be via threads or an event loop. Sans-IO the IO is _outside_ completely. No read/write at all. |
|
| |
| ▲ | dundarious 6 hours ago | parent | prev | next [-] | | There is a token you must pass around, sure, but because you use the same token for both async and sync code, I think analogizing with the typical async function color problem is incorrect. | |
| ▲ | rowanG077 6 hours ago | parent | prev [-] | | Having used zig a bit as a hobby. Why is it more ergonomic? Using await vs passing a token have similar ergonomics to me. The one thing you could say is that using some kind of token makes it dead simple to have different tokens. But that's really not something I run into often at all when using async. | | |
| ▲ | messe 6 hours ago | parent [-] | | > The one thing you could say is that using some kind of token makes it dead simple to have different tokens. But that's really not something I run into often at all when using async. It's valuable to library authors who can now write code that's agnostic of the users' choice of runtime, while still being able to express that asynchronicity is possible for certain code paths. | | |
| ▲ | rowanG077 6 hours ago | parent [-] | | But that can already be done using async await. If you write an async function in Rust for example you are free to call it with any async runtime you want. | | |
| ▲ | messe 6 hours ago | parent [-] | | But you can't call it from synchronous rust. Zig is moving toward all sync code also using the Io interface. | | |
| ▲ | tcfhgj 4 hours ago | parent [-] | | yes, you can: runtime.block_on(async { })
https://play.rust-lang.org/?version=stable&mode=debug&editio... | | |
| ▲ | messe 3 hours ago | parent | next [-] | | Let me rephrase, you can't call it like any other function. In Zig, a function that does IO can be called the same way whether or not it performs async operations. And if those async operations don't need concurrency (which Zig expresses separately to asynchronicity), then they'll run equally well on a sync Io runtime. | |
| ▲ | tcfhgj 3 hours ago | parent [-] | | > In Zig, a function that does IO can be called the same way whether or not it performs async operations or not. no, you can't, you need to pass a IO parameter | | |
| ▲ | messe 3 hours ago | parent [-] | | You will need to pass that for synchronous IO as well. All IO in the standard library is moving to the Io interface. Sync and async. If I want to call a function that does asynchronous IO, I'll use: foo(io, ...);
If I want to call one that does synchronous IO, I'll write: foo(io, ...);
If I want to express that either one of the above can be run asynchronously if possible, I'll write: io.async(foo, .{ io, ... });
If I want to express that it must be run concurrently, then I'll write: try io.concurrent(foo, .{ io, ... });
Nowhere in the above do I distinguish whether or not foo does synchronous or asynchronous IO. I only mark that it does IO, by passing in a parameter of type std.Io. | | |
| ▲ | tcfhgj 3 hours ago | parent [-] | | what about non-io code? | | |
| ▲ | messe 2 hours ago | parent [-] | | What about it? It gets called without an Io parameter. Same way that a function that doesn't allocate doesn't get an allocator. I feel like you're trying to set me up for a gotcha "see, zig does color functions because it distinguishes functions that do io and those that don't!". And yes, that's true. Zig, at least Zig code using std, will mark functions that do Io with an Io parameter. But surely you can see how that will lead to less of a split in the ecosystem compared to sync and async rust? | | |
| ▲ | tcfhgj 2 hours ago | parent [-] | | > But surely you can see how that will lead to less of a split in the ecosystem compared to sync and async rust? not yet |
|
|
|
|
| |
| ▲ | whytevuhuni 3 hours ago | parent | prev [-] | | Here's a problem with that: Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.
https://play.rust-lang.org/?version=stable&mode=debug&editio... | | |
| ▲ | tcfhgj 3 hours ago | parent [-] | | just pass around handles like you do in zig, alright? also: spawn_blocking for blocking code | | |
| ▲ | whytevuhuni 2 hours ago | parent [-] | | But that's the thing, idiomatic Rust sync code almost never passes around handles, even when they need to do I/O. You might be different, and you might start doing that in your code, but almost none of either std or 3rd party libraries will cooperate with you. The difference with Zig is not in its capabilities, but rather in how the ecosystem around its stdlib is built. The equivalent in Rust would be if almost all I/O functions in std would be async; granted that would be far too expensive and disruptive given how async works. | | |
| ▲ | tcfhgj a few seconds ago | parent [-] | | > But that's the thing, idiomatic Rust sync code almost never passes around handles, even when they need to do I/O. Because they don't use async inside. Zig code is passing around handles in code without io? |
|
|
|
|
|
|
|
|
|
|
| ▲ | ethin 5 hours ago | parent | prev | next [-] |
| One thing the old Zig async/await system theoretically allowed me to do, which I'm not certain how to accomplish with this new io system without manually implementing it myself, is suspend/resume. Where you could suspend the frame of a function and resume it later. I've held off on taking a stab at OS dev in Zig because I was really, really hoping I could take advantage of that neat feature: configure a device or submit a command to a queue, suspend the function that submitted the command, and resume it when an interrupt from the device is received. That was my idea, anyway. Idk if that would play out well in practice, but it was an interesting idea I wanted to try. |
| |
| ▲ | nine_k 4 hours ago | parent | next [-] | | Can you create a thread pool consisting of one thread, and suspend / resume the thread? | | | |
| ▲ | NooneAtAll3 5 hours ago | parent | prev [-] | | what's the point of implementing cooperative "multithreading" (coroutines) with preemptive one (async)? |
|
|
| ▲ | amluto 6 hours ago | parent | prev | next [-] |
| I find this example quite interesting: var a_future = io.async(saveFile, .{io, data, "saveA.txt"});
var b_future = io.async(saveFile, .{io, data, "saveB.txt"});
const a_result = a_future.await(io);
const b_result = b_future.await(io);
In Rust or Python, if you make a coroutine (by calling an async function, for example), then that coroutine will not generally be guaranteed to make progress unless someone is waiting for it (i.e. polling it as needed). In contrast, if you stick the coroutine in a task, the task gets scheduled by the runtime and makes progress when the runtime is able to schedule it. But creating a task is an explicit operation and can, if the programmer wants, be done in a structured way (often called “structured concurrency”) where tasks are never created outside of some scope that contains them. From this example, if the example allows the thing that is “io.async”ed to progress all by itself, then I guess it’s creating a task that lives until it finishes or is cancelled by getting destroyed. This is certainly a valid design, but it’s not the direction that other languages seem to be choosing. |
| |
| ▲ | jayd16 6 hours ago | parent | next [-] | | C# works like this as well, no? In fact C# can (will?) run the async function on the calling thread until a yield is hit. | | |
| ▲ | throwup238 5 hours ago | parent [-] | | So do Python and Javascript. I think most languages with async/await also support noop-ing the yield if the future is already resolved. It’s only when you create a new task/promise that stuff is guaranteed to get scheduled instead of possibly running immediately. | | |
| ▲ | amluto 3 hours ago | parent [-] | | I can't quite parse what you're saying. Python works like this: import asyncio
async def sleepy() -> None:
print('Sleepy started')
await asyncio.sleep(0.25)
print('Sleepy resumed once')
await asyncio.sleep(0.25)
print('Sleepy resumed and is done!')
async def main():
sleepy_future = sleepy()
print('Started a sleepy')
await asyncio.sleep(2)
print('Main woke back up. Time to await the sleepy.')
await sleepy_future
if __name__ == "__main__":
asyncio.run(main())
Running it does this: $ python3 ./silly_async.py
Started a sleepy
Main woke back up. Time to await the sleepy.
Sleepy started
Sleepy resumed once
Sleepy resumed and is done!
So there mere act of creating a coroutine does not cause the runtime to run it. But if you explicitly create a task, it does get run: import asyncio
async def sleepy() -> None:
print('Sleepy started')
await asyncio.sleep(0.25)
print('Sleepy resumed once')
await asyncio.sleep(0.25)
print('Sleepy resumed and is done!')
async def main():
sleepy_future = sleepy()
print('Started a sleepy')
sleepy_task = asyncio.create_task(sleepy_future)
print('The sleepy future is now in a task')
await asyncio.sleep(2)
print('Main woke back up. Time to await the task.')
await sleepy_task
if __name__ == "__main__":
asyncio.run(main())
$ python3 ./silly_async.py
Started a sleepy
The sleepy future is now in a task
Sleepy started
Sleepy resumed once
Sleepy resumed and is done!
Main woke back up. Time to await the task.
I personally like the behavior of coroutines not running unless you tell them to run -- it makes it easier to reason about what code runs when. But I do not particularly like the way that Python obscures the difference between a future-like thing that is a coroutine and a future-like thing that is a task. |
|
| |
| ▲ | nmilo 6 hours ago | parent | prev | next [-] | | This is how JS works | |
| ▲ | messe 6 hours ago | parent | prev [-] | | It's not guaranteed in Zig either. Neither task future is guaranteed to do anything until .await(io) is called on it. Whether it starts immediately (possibly on the same thread), or queued on a thread pool, or yields to an event loop, is entirely dependent on the Io runtime the user chooses. | | |
| ▲ | amluto 5 hours ago | parent [-] | | It’s not guaranteed, but, according to the article, that’s how it works in the Evented model: > When using an Io.Threaded instance, the async() function doesn't actually do anything asynchronously — it just runs the provided function right away. So, with that version of the interface, the function first saves file A and then file B. With an Io.Evented instance, the operations are actually asynchronous, and the program can save both files at once. Andrew Kelley’s blog (https://andrewkelley.me/post/zig-new-async-io-text-version.h...) discusses io.concurrent, which forces actual concurrency, and it’s distinctly non-structured. It even seems to require the caller to make sure that they don’t mess up and keep a task alive longer than whatever objects the task might reference: var producer_task = try io.concurrent(producer, .{
io, &queue, "never gonna give you up",
});
defer producer_task.cancel(io) catch {};
Having personally contemplated this design space a little bit, I think I like Zig’s approach a bit more than I like the corresponding ideas in C and C++, as Zig at least has defer and tries to be somewhat helpful in avoiding the really obvious screwups. But I think I prefer Rust’s approach or an actual GC/ref-counting system (Python, Go, JS, etc) even more: outside of toy examples, it’s fairly common for asynchronous operations to conceptually outlast single function calls, and it’s really really easy to fail to accurately analyze the lifetime of some object, and having the language prevent code from accessing something beyond its lifetime is very, very nice. Both the Rust approach of statically verifying the lifetime and the GC approach of automatically extending the lifetime mostly solve the problem.But this stuff is brand new in Zig, and I’ve never written Zig code at all, and maybe it will actually work very well. | | |
| ▲ | messe 4 hours ago | parent [-] | | Ah, I think we might have been talking over each other. I'm referring to the interface not guaranteeing anything, not the particular implementation. The Io interface itself doesn't guarantee that anything will have started until the call to await returns. |
|
|
|
|
| ▲ | et1337 7 hours ago | parent | prev | next [-] |
| I’m excited to see how this turns out. I work with Go every day and I think Io corrects a lot of its mistakes. One thing I am curious about is whether there is any plan for channels in Zig. In Go I often wish IO had been implemented via channels. It’s weird that there’s a select keyword in the language, but you can’t use it on sockets. |
| |
| ▲ | jerf 6 hours ago | parent | next [-] | | Wrapping every IO operation into a channel operation is fairly expensive. You can get an idea of how fast it would work now by just doing it, using a goroutine to feed a series of IO operations to some other goroutine. It wouldn't be quite as bad as the perennial "I thought Go is fast why is it slow when I spawn a full goroutine and multiple channel operations to add two integers together a hundred million times" question, but it would still be a fairly expensive operation. See also the fact that Go had fairly sensible iteration semantics before the recent iteration support was added by doing a range across a channel... as long as you don't mind running a full channel operation and internal context switch for every single thing being iterated, which in fact quite a lot of us do mind. (To optimize pure Python, one of the tricks is to ensure that you get the maximum value out of all of the relatively expensive individual operations Python does. For example, it's already handling exceptions on every opcode, so you could win in some cases by using exceptions cleverly to skip running some code selectively. Go channels are similar; they're relatively expensive, on the order of dozens of cycles, so you want to make sure you're getting sufficient value for that. You don't have to go super crazy, they're not like a millisecond per operation or something, but you do want to get value for the cost, by either moving non-trivial amount of work through them or by taking strong advantage of their many-to-many coordination capability. IO often involves moving around small byte slices, even perhaps one byte, and that's not good value for the cost. Moving kilobytes at a time through them is generally pretty decent value but not all IO looks like that and you don't want to write that into the IO spec directly.) | |
| ▲ | Zambyte 2 hours ago | parent | prev | next [-] | | > One thing I am curious about is whether there is any plan for channels in Zig. The Zig std.Io equivalent of Golang channels is std.Io.Queue[0]. You can do the equivalent of:

type T interface{}
fooChan := make(chan T)
barChan := make(chan T)
select {
case foo := <- fooChan:
// handle foo
case bar := <- barChan:
// handle bar
}
in Zig like:

const T = void;
var foo_queue: std.Io.Queue(T) = undefined;
var bar_queue: std.Io.Queue(T) = undefined;
var get_foo = io.async(Io.Queue(T).getOne, .{ &foo_queue, io });
defer get_foo.cancel(io) catch {};
var get_bar = io.async(Io.Queue(T).getOne, .{ &bar_queue, io });
defer get_bar.cancel(io) catch {};
switch (try io.select(.{
.foo = &get_foo,
.bar = &get_bar,
})) {
.foo => |foo| {
// handle foo
},
.bar => |bar| {
// handle bar
},
}
Obviously not quite as ergonomic, but the trade-off of being able to use any IO runtime, and to do this style of concurrency without a runtime garbage collector, is really interesting.

[0] https://ziglang.org/documentation/master/std/#std.Io.Queue | |
| ▲ | ecshafer 6 hours ago | parent | prev | next [-] | | Have you tried Odin? It's a great language that's also a “better C” but takes more Go inspiration than Zig. | | |
| ▲ | dismalaf 2 hours ago | parent [-] | | Second vote for Odin but with a small caveat. Odin doesn't (and won't ever according to its creator) implement specific concurrency strategies. No async, coroutines, channels, fibers, etc... The creator sees concurrency strategy (as well as memory management) as something that's higher level than what he wants the language to be. Which is fine by me, but I know lots of people are looking for "killer" features. |
| |
| ▲ | osigurdson 6 hours ago | parent | prev | next [-] | | At least Go didn't take the dark path of having async / await keywords. In C# it's a real nightmare, forcing you into sync-over-async anti-patterns unless you're willing to rewrite everything. I'm glad Zig took this "colorless" approach. | | |
| ▲ | rowanG077 6 hours ago | parent [-] | | Where do you think the Io parameter comes from? If you change some function to do something async and now suddenly you require an Io instance. I don't see the difference between having to modify the call tree to be async vs modifying the call tree to pass in an Io token. | | |
| ▲ | messe 6 hours ago | parent [-] | | Synchronous Io also uses the Io instance now. The coloring is no longer "is it async?" it's "does it perform Io"? This allows library authors to write their code in a manner that's agnostic to the Io runtime the user chooses, synchronous, threaded, evented with stackful coroutines, evented with stackless coroutines. | | |
| ▲ | rowanG077 6 hours ago | parent [-] | | Rust also allows writing async code that is agnostic to the async runtime used. Subsuming async under Io doesn't change much imo. |
|
|
| |
| ▲ | kbd 6 hours ago | parent | prev [-] | | One of the harms Go has done is to make people think its concurrency model is at all special. “Goroutines” are green threads and a “channel” is just a thread-safe queue, which Zig has in its stdlib https://ziglang.org/documentation/master/std/#std.Io.Queue | | |
| ▲ | jerf 6 hours ago | parent | next [-] | | A channel is not just a thread-safe queue. It's a thread-safe queue that can be used in a select call. Select is the distinguishing feature, not the queuing. I don't know enough Zig to know whether you can write a bit of code that says "either pull from this queue or that queue when they are ready"; if so, then yes they are an adequate replacement, if not, no they are not. Of course even if that exact queue is not itself selectable, you can still implement a Go channel with select capabilities in Zig. I'm sure one exists somewhere already. Go doesn't get access to any magic CPU opcodes that nobody else does. And languages (or libraries in languages where that is possible) can implement more capable "select" variants than Go ships with that can select on more types of things (although not necessarily for "free", depending on exactly what is involved). But it is more than a queue, which is also why Go channel operations are a bit on the expensive side; they're implementing more functionality than a simple queue. | | |
| ▲ | kbd 4 hours ago | parent | next [-] | | > I don't know enough Zig to know whether you can write a bit of code that says "either pull from this queue or that queue when they are ready"; if so, then yes they are an adequate replacement, if not, no they are not. Thanks for giving me a reason to peek into how Zig does things now. Zig has a generic select function[1] that works with futures. As is common, Blub's language feature is Zig's comptime function. Then the io implementation has a select function[2] that "Blocks until one of the futures from the list has a result ready, such that awaiting it will not block. Returns that index." and the generic select switches on that and returns the result. Details unclear tho. [1] https://ziglang.org/documentation/master/std/#std.Io.select [2] https://ziglang.org/documentation/master/std/#std.Io.VTable | | |
| ▲ | jerf 3 hours ago | parent | next [-] | | Getting a simple future from multiple queues and then waiting for the first one is not a match for Go channel semantics. If you do a select on three channels, you will receive a result from one of them, but you don't get any future claim on the other two channels. Other goroutines could pick them up. And if another goroutine does get something from those channels, that is a guaranteed one-time communication and the original goroutine now can not get access to that value; the future does not "resolve". Channel semantics don't match futures semantics. As the name implies, channels are streams, futures are a single future value that may or may not have resolved yet. Again, I'm sure nothing stops Zig from implementing Go channels in half-a-dozen different ways, but it's definitely not as easy as "oh just wrap a future around the .get of a threaded queue". By a similar argument it should be observed that channels don't naively implement futures either. It's fairly easy to make a future out of a channel and a couple of simple methods; I think I see about 1 library a month going by that "implements futures" in Go. But it's something that has to be done because channels aren't futures and futures aren't channels. (Note that I'm not making any arguments about whether one or the other is better. I think such arguments are actually quite difficult because while both are quite different in practice, they also both fairly fully cover the solution space and it isn't clear to me there's globally an advantage to one or the other. But they are certainly different.) | | |
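jerf's remark that "it's fairly easy to make a future out of a channel and a couple of simple methods" (but not the reverse) can be sketched in a few lines of Go. This `Future` type and its methods are hypothetical illustration names, not a real library:

```go
package main

import "fmt"

// A one-shot future built from a channel: the channel is closed once
// the value is set, so any number of goroutines can Get() the same
// result. A bare channel receive, by contrast, hands each value to
// exactly one receiver and then it's gone.
type Future struct {
	done chan struct{}
	val  int
}

func NewFuture() *Future {
	return &Future{done: make(chan struct{})}
}

// Resolve must be called at most once.
func (f *Future) Resolve(v int) {
	f.val = v
	close(f.done) // close-then-receive gives the needed happens-before
}

// Get blocks until the future resolves; repeated calls return the
// same value, which is exactly the "resolved" semantics channels lack.
func (f *Future) Get() int {
	<-f.done
	return f.val
}

func main() {
	f := NewFuture()
	go f.Resolve(42)
	fmt.Println(f.Get())
	fmt.Println(f.Get()) // still 42: the result does not "drain"
}
```

This is the future-from-channel direction; building full channel semantics (streaming, many-to-many handoff, select) out of futures is the hard direction jerf is pointing at.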
| ▲ | kbd 2 hours ago | parent [-] | | > channels aren't futures and futures aren't channels. In my mind a queue.getOne ~= a <- on a Go channel. Idk how you wrap the getOne call in a Future to hand it to Zig's select but that seems like it would be a straightforward pattern once this is all done. I really do appreciate you being strict about the semantics. Tbh the biggest thing I feel fuzzy on in all this is how go/zig actually go about finding the first completed future in a select, but other than that am I missing something? https://ziglang.org/documentation/master/std/#std.Io.Queue.g... |
| |
| ▲ | SkiFire13 3 hours ago | parent | prev [-] | | Maybe I'm missing something, but how do you get a `Future` for receiving from a channel? Even better, how would I write my own `Future` in a way that supports this `select` and is compatible with any reasonable `Io` implementation? |
| |
| ▲ | jeffbee 5 hours ago | parent | prev [-] | | If we're just arguing about the true nature of Scotsmen, isn't "select a channel" merely a convenience around awaiting a condition? | | |
| ▲ | jerf 3 hours ago | parent | next [-] | | This is not a "true Scotsman" argument. It's the distinctive characteristic of Go channels. If all you have are threaded queues where you can call ".get()" from another thread, but that operation is blocking and you can't try any other queues, then you can't write:

select {
case result := <-resultChan:
// whatever
case <-cxt.Done():
// our context either timed out or was cancelled
}
or any more elaborate structure.

Or, to put it a different way, when someone says "I implement Go channels in X Language" I don't look for whether they have a threaded queue but whether they have a select equivalent. Odds are that there's already a dozen "threaded queues" in X Language anyhow, but select is less common. Again note the difference between the words "distinctive" and "unique". No individual feature of Go is unique, of course, because again, Go does not have special access to CPU opcodes that no one else can use. It's the more defining characteristic compared to the more mundane and normal threaded queue. Of course you can implement this a number of ways. It is not equivalent to a naive condition wait, but probably with enough work you could implement channels more or less with a condition, possibly with some additional compiler assistance to make it easier to use, since you'd need to be combining several together in some manner. | |
| ▲ | SkiFire13 3 hours ago | parent | prev [-] | | It's more akin to awaiting *any* condition from a list. |
|
| |
| ▲ | 0x696C6961 6 hours ago | parent | prev | next [-] | | What other mainstream languages have pre-emptive green threads without function coloring? I can only think of Erlang. | | |
| ▲ | smw 6 hours ago | parent | next [-] | | I'm told modern Java (loom?) does. But I think that might be an exhaustive list, sadly. | |
| ▲ | femiagbabiaka 6 hours ago | parent | prev [-] | | Maybe not mainstream, but Racket. |
| |
| ▲ | dlisboa 5 hours ago | parent | prev [-] | | It was special. CSP wasn't anywhere near the common vocabulary back in 2009. Channels provide a different way of handling synchronization. Everything is "just another thing" if you ignore the advantage of abstraction. |
|
|
|
| ▲ | badmonster 3 hours ago | parent | prev | next [-] |
| Interesting to see Zig tackle async. The io_uring-first approach makes sense for modern systems, but the challenge is always making async ergonomic without sacrificing Zig's explicit control philosophy. Curious how colored functions will play out in practice. |
|
| ▲ | breatheoften an hour ago | parent | prev | next [-] |
| Is there any way to implement structured concurrency on top of the std.Io primitive? |
| |
| ▲ | AndyKelley 27 minutes ago | parent [-] | | var group: Io.Group = .init;
defer group.cancel(io);
If you see this pattern, you are doing structured concurrency. Same thing with:

var future = io.async(foo, .{});
defer future.cancel(io);
|
|
|
| ▲ | mono442 4 hours ago | parent | prev | next [-] |
| It looks like a promising idea, though I'm a bit skeptical that they can actually make it work transparently with other executors (stackless coroutines, for example), and it probably won't work with code that uses FFI anyway. |
|
| ▲ | qudat 7 hours ago | parent | prev | next [-] |
| I'm excited to see where this goes. I recently did some io_uring work in zig and it was a pain to get right. Although, it does seem like dependency injection is becoming a popular trend in zig, first with Allocator and now with Io. I wonder if a dependency injection framework within the std could reduce the amount of boilerplate all of our functions will now require. Every struct or bare fn now needs (2) fields/parameters by default. |
| |
| ▲ | messe 6 hours ago | parent | next [-] | | > Every struct or bare fn now needs (2) fields/parameters by default. Storing interfaces as fields in structs is becoming a bit of an anti-pattern in Zig. There are still use cases for it, but you should think twice about it being your go-to strategy. There's been a recent shift in the standard library toward "unmanaged" containers, which don't store a copy of the Allocator interface; instead, allocators are passed to any member function that allocates. Previously, one would write:

var list: std.ArrayList(u32) = .init(allocator);
defer list.deinit();
for (0..count) |i| {
try list.append(@intCast(i));
}
Now, it's:

var list: std.ArrayList(u32) = .empty;
defer list.deinit(allocator);
for (0..count) |i| {
try list.append(allocator, @intCast(i));
}
Or better yet:

var list: std.ArrayList(u32) = .empty;
defer list.deinit(allocator);
try list.ensureUnusedCapacity(allocator, count); // Allocate up front
for (0..count) |i| {
list.appendAssumeCapacity(@intCast(i)); // No try or allocator necessary here
}
| | |
| ▲ | turtletontine 2 hours ago | parent [-] | | I’m not sure I see how each example improves on the previous (though granted, I don’t really know Zig). What happens if you call append() with two different allocators? Or if you deinit() with a different allocator than the one that actually handled the memory? | | |
| ▲ | messe an hour ago | parent [-] | | Storing an Allocator alongside the container is an additional 16 bytes. This isn't much, but it starts adding up when you store other objects that keep allocators inside those containers. This can improve cache locality. It also helps devirtualization, as the most common case is threading a single allocator through your application (with the occasional arena allocator wrapping it for grouped allocations). When the Allocator interface is stored in the container, it's harder for the optimizer to prove it hasn't changed. > What happens if you call append() with two different allocators? Or if you deinit() with a different allocator than the one that actually handled the memory? It's undefined behaviour, but I've never seen it be an issue in practice. Expanding on what I mentioned above, it's typical for only a single allocator to be used for long-lived objects throughout the entire program. Arena allocators are used for grouped allocations and tend to have a well-defined scope, so it's obvious where deallocation occurs. FixedBufferAllocator also tends to be used in the same limited scope. |
|
| |
| ▲ | scuff3d 4 hours ago | parent | prev | next [-] | | I think a good compromise between a DI framework and having to pass everything individually would be some kind of Context object. It could be created to hold an Allocator, IO implementation, and maybe a Diagnostics struct since Zig doesn't like attaching additional information to errors. Then the whole Context struct or parts of it could be passed around as needed. | |
| ▲ | Mond_ 6 hours ago | parent | prev | next [-] | | Yes, and it's good that way. Please, anything but a dependency injection framework. All parameters and dependencies should be explicit. | |
| ▲ | SvenL 7 hours ago | parent | prev [-] | | I think and hope that they don’t do that. As far as I remember their mantra was „no magic, you can see everything which is happening“. They wanted to be a simple and obvious language. | | |
| ▲ | qudat 6 hours ago | parent [-] | | That's fair, but the same argument can be made for Go's verbose error handling. In that case we could argue that `try` is magical, although I don't think anyone would want to take that away. |
|
|
|
| ▲ | dylanowen 6 hours ago | parent | prev | next [-] |
| This seems a lot like what the scala libraries Zio or Kyo are doing for concurrency, just without the functional effect part. |
|
| ▲ | codr7 6 hours ago | parent | prev | next [-] |
| Love it, async code is a major pita in most languages. |
| |
| ▲ | giancarlostoro 6 hours ago | parent [-] | | When Microsoft added Tasks / Async Await, that was when I finally stopped writing single threaded code as often as I did, since the mental overhead drastically went away. Python 3 as well. | | |
| ▲ | codr7 5 hours ago | parent [-] | | Isn't this exactly the mess Zig is trying to get out of here? Every other example I've seen encodes the execution model in the source code. |
|
|
|
| ▲ | Ericson2314 5 hours ago | parent | prev | next [-] |
| This is a bad explanation because it doesn't explain how the concurrency actually works. Is it based on stacks? Is there a heavy runtime? Is it stackless and everything is compiled twice? IMO every low level language's async thing is terrible and half-baked, and I hate that this sort of rushed job is now considered de rigueur. (IMO We need a language that makes the call stack just another explicit data structure, like assembly and has linearity, "existential lifetimes", locations that change type over the control flow, to approach the question. No language is very close.) |
|
| ▲ | debugnik 6 hours ago | parent | prev | next [-] |
| > Languages that don't make a syntactical distinction (such as Haskell) essentially solve the problem by making everything asynchronous What the heck did I just read. I can only guess they confused Haskell for OCaml or something; the former is notorious for requiring that all I/O is represented as values of some type encoding the full I/O computation. There's still coloring since you can't hide it, only promote it to a more general colour. Plus, isn't Go the go-to example of this model nowadays? |
| |
| ▲ | gf000 6 hours ago | parent [-] | | Haskell has green threads. Plus nowadays Java also has virtual threads. | | |
| ▲ | debugnik 6 hours ago | parent [-] | | And I bet those green threads still need an IO type of some sort to encode anything non-pure, plus usually do-syntax. Comparing merely concurrent computations to I/O-async is just weird. In fact, I suspect that even those green threads already have a "colourful" type, although I can't check right now. | | |
|
|
|
| ▲ | cies 6 hours ago | parent | prev | next [-] |
| I like Zig and I like their approach in this case. From the article:

std.Io.Threaded - based on a thread pool.
-fno-single-threaded - supports concurrency and cancellation.
-fsingle-threaded - does not support concurrency or cancellation.
std.Io.Evented - work-in-progress [...]
Should `std.Io.Threaded` not be split into `std.Io.Threaded` and `std.Io.Sequential` instead? Single threaded is another word for "not threaded", or am I wrong here? |
|
| ▲ | ecshafer 7 hours ago | parent | prev | next [-] |
| I like the look of this direction. I am not a fan of the `async` keyword that has become so popular in some languages that then pollutes the codebase. |
| |
| ▲ | davidkunz 7 hours ago | parent | next [-] | | In JavaScript, I love the `async` keyword as it's a good indicator that something goes over the wire. | |
| ▲ | Dwedit 5 hours ago | parent | prev | next [-] | | Async always confused me as to when a function would actually create a new thread or not. | |
| ▲ | warmwaffles 7 hours ago | parent | prev [-] | | Async usually ends up being a coloring function that knows no bounds once it is used. | | |
| ▲ | amonroe805-2 7 hours ago | parent [-] | | I’ve never really understood the issue with this. I find it quite useful to know what functions may do something async vs which ones are guaranteed to run without stopping. In my current job, I mostly write (non-async) python, and I find it to be a performance footgun that you cannot trivially tell when a method call will trigger I/O, which makes it incredibly easy for our devs to end up with N+1-style queries without realizing it. With async/await, devs are always forced into awareness of where these operations do and don’t occur, and are much more likely to manage them effectively. FWIW: The zig approach also seems great here, as the explicit Io function argument seems likely to force a similar acknowledgement from the developer. And without introducing new syntax at that! Am excited to see how well it works in practice. | | |
| ▲ | newpavlov 7 hours ago | parent | next [-] | | In my (Rust-colored) opinion, the async keyword has two main problems: 1) It tracks a code property which is usually omitted in sync code (i.e. most languages do not mark functions with "does IO"). Why is IO more important than "may panic", "uses bounded stack", "may perform allocations", etc.? 2) It implements an ad-hoc problem-specific effect system with various warts. And working around those warts requires re-implementation of half of the language. | |
| ▲ | echelon 6 hours ago | parent [-] | | > Why IO is more important than "may panic", "uses bounded stack", "may perform allocations", etc.? Rust could use these markers as well. | | |
| ▲ | newpavlov 6 hours ago | parent [-] | | I agree. But it should be done with a proper effect system, not a pile of ad hoc hacks built on abuse of the type system. | | |
| ▲ | echelon 4 hours ago | parent [-] | | `async` is in the type system. In your mind, how would you mark and bubble up panicky functions, etc.? What would that look like? I felt like a `panic` label for functions would be nice, but if we start stacking labels it becomes cumbersome: pub async panic alloc fn foo() {}
That feels dense. I think ideally it would be something readers could spot at first glance, not something inferred. | |
| ▲ | newpavlov an hour ago | parent [-] | | >`async` is in the type system. No, it's not. `async` is just syntax sugar; the "effect" gets emulated in the type system using `Future`. This is one of the reasons why the `async` system feels so foreign and requires so many language changes to make it remotely usable. `const` is much closer to a "true" effect (well, to be precise it's an anti-effect, but that's not important right now). Also, I think it's useful to distinguish between effect and type systems, instead of lumping them into just "type system". The former applies to code and the latter to data. >That feels dense. Yes. But why is `async` more important than `alloc`? For some applications it's as important to know about potential allocations as it is for others to know whether code potentially yields. Explicitly listing all effects would be the most straightforward approach, but I think a more practical approach would be to have a list of "default effects", which can be overwritten at the crate (or maybe even module) level. And at the function level you would be able to opt in to or out of effects if needed. >I think ideally it would be something readers could spot at first glance Well, you can either have "dense" or explicit "at first glance" signatures.
|
|
|
| |
| ▲ | ecshafer 7 hours ago | parent | prev | next [-] | | Is this Django? I could maybe see that argument there. Some frameworks and ORMs can muddy that distinction. But most the code ive written its really clear if something will lead to io or not. | |
| ▲ | warmwaffles 5 hours ago | parent | prev [-] | | I've watched many changes over time where a non-async function uses an async call, and then the function eventually becomes marked as async. Once the majority of functions are marked as async, what was the point of that boilerplate? |
|
|
|
|
| ▲ | LunicLynx 5 hours ago | parent | prev [-] |
| Pro tip: use postfix keyword notation, e.g. doSomethingAsync().defer. This removes stupid parentheses caused by precedence rules, which are the biggest issue with async/await in other languages. |