xg15 2 days ago

I learned about the concept of async/await from JS and back then was really amazed by the elegance of it.

By now, the downsides are well-known, but I think Python's implementation did a few things that made it particularly unpleasant to use.

There is the usual "colored functions" problem. Python has that too, but on steroids: There are sync and async functions, but then some of the sync functions can only be called from an async function, because they expect an event loop to be present, while others must not be called from an async function because they block the thread or take a lot of CPU to run or just refuse to run if an event loop is detected. That makes at least four colors.
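A rough sketch of two of those extra colors, using only standard asyncio calls (the helper names `needs_loop`, `nested` and `inside_loop` are made up for illustration): `asyncio.get_running_loop()` is a sync call that only works while a loop is running, and `asyncio.run()` is a sync call that refuses to run when it detects one.

```python
import asyncio

def needs_loop():
    # Sync function that only works while an event loop is running
    return asyncio.get_running_loop()

async def nested():
    return 1

async def inside_loop():
    needs_loop()                 # fine here: a loop is running
    coro = nested()
    try:
        asyncio.run(coro)        # sync function that refuses inside a loop
    except RuntimeError:
        coro.close()             # discard the never-started coroutine
        return "refused"
    return "accepted"

try:
    needs_loop()                 # no running loop outside async code
    outside = "accepted"
except RuntimeError:
    outside = "refused"

inside = asyncio.run(inside_loop())
```

Both calls look like ordinary sync functions at the call site; nothing in their signatures says which context they need.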

The API has the same complexity: In JS, there are 3 primitives that you interact with in code: Sync functions, async functions and promises. (Understanding the event loop is needed to reason about the program, but it's never visible in the code).

Whereas Python has: Generators, Coroutines, Awaitables, Futures, Tasks, Event Loops, AsyncIterators and probably a few more.

All that for not much benefit in everyday situations. One of the biggest advantages of async/await was "fearless concurrency": The guarantee that your variables can only change at well-defined await points, and can only change "atomically". However, Python can't actually give the first guarantee, because threaded code may run in parallel to your async code. The second guarantee already comes for free in all Python code, thanks to the GIL - you don't need async for that.

mcdeltat 2 days ago | parent | next [-]

I think Python async is pretty cool - much nicer than threading or multiprocessing - yet has a few annoying rough edges like you say. Some specific issues I run into every time:

Function colours can get pretty verbose when you want to write functional wrappers. You can end up writing nearly the exact same code twice because one needs to be async to handle an async function argument, even if the real functionality of the wrapper isn't async.
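To illustrate the duplication, here is a minimal sketch of a hypothetical `logged` wrapper (the name and the `calls` list are invented for this example). The wrapper's own job is purely synchronous, yet it needs a second, async body just to forward the `await`.

```python
import asyncio
import functools
import inspect

calls = []  # hypothetical call log, used only for this illustration

def logged(func):
    """Record each call; ends up with two near-identical bodies."""
    if inspect.iscoroutinefunction(func):
        @functools.wraps(func)
        async def async_wrapper(*args, **kwargs):
            calls.append(func.__name__)
            return await func(*args, **kwargs)
        return async_wrapper

    @functools.wraps(func)
    def sync_wrapper(*args, **kwargs):
        calls.append(func.__name__)
        return func(*args, **kwargs)
    return sync_wrapper

@logged
def add(a, b):
    return a + b

@logged
async def add_async(a, b):
    return a + b

sync_result = add(1, 2)
async_result = asyncio.run(add_async(3, 4))
```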

Coroutines vs futures vs tasks are odd. More often than is pleasant, you have one but an API needs another, for no intuitive reason. Some waiting functions work on some types and not on others. But you can usually easily convert between them - so why make a distinction in the first place?
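A small sketch of that distinction and the conversion, assuming a made-up `compute` coroutine. `asyncio.ensure_future()` wraps a coroutine in a Task, and `asyncio.wait()` is one of the functions that wants tasks or futures (recent Python versions reject bare coroutines there).

```python
import asyncio

async def compute():
    return 42

async def main():
    coro = compute()                    # coroutine object, nothing runs yet
    task = asyncio.ensure_future(coro)  # wraps the coroutine in a Task
    kinds = (
        asyncio.iscoroutine(coro),  # a coroutine...
        asyncio.isfuture(coro),     # ...is not a Future
        asyncio.isfuture(task),     # but a Task is (Task subclasses Future)
    )
    done, pending = await asyncio.wait([task])  # wait() takes tasks/futures
    return kinds, task.result(), len(pending)

kinds, value, n_pending = asyncio.run(main())
```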

I think if you create a task but don't await it (which is plausible in a server type scenario), it's not guaranteed to run because of garbage collection or something. That's weird. Such behaviour should be obviously defined in the API.
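For what it's worth, the asyncio docs do mention this: the event loop only keeps weak references to tasks, so you are supposed to hold on to the result of `create_task()` yourself. A minimal sketch of the usual workaround, with hypothetical names `spawn`, `handle` and `background_tasks`:

```python
import asyncio

background_tasks = set()  # strong references keep fire-and-forget tasks alive
results = []

async def handle(request_id):
    await asyncio.sleep(0)
    results.append(request_id)

def spawn(coro):
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    # Drop the reference once the task is finished
    task.add_done_callback(background_tasks.discard)
    return task

async def main():
    for i in range(3):
        spawn(handle(i))
    # Wait for whatever is still in flight before the loop shuts down
    await asyncio.gather(*background_tasks)

asyncio.run(main())
```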

tylerhou 2 days ago | parent | next [-]

> You can end up writing nearly the exact same code twice because one needs to be async to handle an async function argument, even if the real functionality of the wrapper isn't async.

Sorry for the possibly naive question. If I need to call a synchronous function from an async function, why can't I just call await on the async argument?

    from typing import Awaitable

    def foo(bar: str, baz: int):
        # some synchronous work
        pass

    async def other(bar: Awaitable[str]):
        foo(await bar, 0)

gcharbonnier 11 hours ago | parent [-]

Nothing stops you, and that's the problem: even though you can do it, your event loop will block until foo has finished executing, meaning that no other coroutine will run in this thread in the meantime. (An event loop runs in its own thread; most of the time there is only the main thread, and thus a single event loop.) This defeats the purpose of concurrent programming.
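A minimal sketch of the difference, assuming `foo` does blocking work (simulated here with `time.sleep`) and using `asyncio.to_thread()` (Python 3.9+) as one possible way to keep the loop free. The `ticker` coroutine and the `events` list are made up for the demonstration.

```python
import asyncio
import time

events = []

def foo(bar: str, baz: int) -> None:
    time.sleep(0.2)  # stands in for blocking synchronous work
    events.append("foo done")

async def ticker():
    events.append("tick")

async def blocking_call():
    task = asyncio.create_task(ticker())
    foo("x", 0)  # the loop is stuck here, ticker cannot run yet
    await task

async def offloaded_call():
    task = asyncio.create_task(ticker())
    await asyncio.to_thread(foo, "x", 0)  # loop stays free while foo runs
    await task

asyncio.run(blocking_call())
blocked_order = list(events)
events.clear()
asyncio.run(offloaded_call())
offloaded_order = list(events)
```

In the first run the ticker only gets its turn after `foo` returns; in the second it runs while `foo` is still sleeping in the worker thread.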

everforward 2 days ago | parent | prev | next [-]

> I think if you create a task but don't await it (which is plausible in a server type scenario), it's not guaranteed to run because of garbage collection or something.

I think that use case doesn't work well in async, because async effectively creates a tree of Promises that resolve in order. A task that doesn't get await-ed is effectively outside its own tree of Promises because it may outlive the Promise it is a child of.

I think the solution would be something like Linux's zombie process reaping, and I can see how the devs prefer just not running those tasks to dealing with that mess.

xg15 2 days ago | parent [-]

No, Python's system is more complex and unfortunately overloads "await" to do several things.

If you just do

  async def myAsyncFunction():
    ...
    await someOtherAsyncFunction()
    ...
then the call to someOtherAsyncFunction will not spawn any kind of task or delegate to the event loop at all - it will just execute someOtherAsyncFunction() within the task and event loop iteration that myAsyncFunction() is already running in. This is a major difference from JS.

If you just did

  someOtherAsyncFunction()
without await, this would be a fire-and-forget call in JS, but in Python, it doesn't do anything. The statement creates a coroutine object for the someOtherAsyncFunction() call, but doesn't actually execute the call and instead just throws the object away again.

I think this is what triggers the "coroutine is not awaited" warning: It's not complaining about fire-and-forget being bad style, it's warning that your code probably doesn't do what you think it does.
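A quick way to check this, as a sketch with a made-up `some_other_async_function` and a `ran` list to record whether the body executed:

```python
import asyncio

ran = []

async def some_other_async_function():
    ran.append("executed")

async def main():
    coro = some_other_async_function()  # only builds a coroutine object
    before = list(ran)                  # still empty: the body has not run
    coro.close()                        # discard it explicitly, avoids the warning
    await some_other_async_function()   # this one actually runs
    return before, list(ran)

before, after = asyncio.run(main())
```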

The same pitfall is running things concurrently. In JS, you'd do:

  const task1 = asyncFunc1();
  const task2 = asyncFunc2();
  await task1;
  await task2;
In Python, the functions will be run sequentially, in the await lines, not in the lines with the function calls.
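Here is a sketch of the same pattern in Python, with a hypothetical `async_func` that logs when it starts and ends, to show that each body runs at its await line:

```python
import asyncio

log = []

async def async_func(name):
    log.append(f"{name} start")
    await asyncio.sleep(0)
    log.append(f"{name} end")

async def main():
    task1 = async_func("one")   # nothing runs yet
    task2 = async_func("two")   # nothing runs yet
    log.append("both created")
    await task1                 # "one" runs here, start to finish
    await task2                 # only then does "two" run

asyncio.run(main())
```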

To actually run things concurrently, you have to do

  loop.create_task(asyncFunc())
or one of the related methods. The method will schedule a new task and return a future that you can await on, but don't have to. But that "await" would work completely differently from the previous awaits internally.
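A sketch of the task-based version, using `asyncio.create_task()` (the shortcut for the running loop) and a made-up `async_func` that yields once via `asyncio.sleep(0)`. The two bodies now interleave.

```python
import asyncio

log = []

async def async_func(name):
    log.append(f"{name} start")
    await asyncio.sleep(0)      # give the loop a chance to switch tasks
    log.append(f"{name} end")

async def main():
    task1 = asyncio.create_task(async_func("one"))  # scheduled on the loop
    task2 = asyncio.create_task(async_func("two"))  # scheduled on the loop
    log.append("both scheduled")
    await task1
    await task2

asyncio.run(main())
```

Note that neither task starts until `main` first yields to the loop at `await task1`.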
everforward a day ago | parent [-]

I think this is semantically the same thing, though I'm sure your terminology is more correct (not an expert here).

If you do `someOtherAsyncFunction()` without await and Python tried to execute similarly to a version with `await`, then the one without await would happen in the same task and event loop iteration but there's no guarantee that it's done by the time the outer function is. Thus the existing task/event loop iteration has to be kept alive or the non-await'ed task needs to be reaped to some other task/event loop iteration.

> loop.create_task(asyncFunc())

This sort of intuitively makes sense to me because you're creating a new "context" of sorts directly within the event loop. It's similar-ish to creating daemons as children of PID 1 rather than children of more-ephemeral random PIDs.

xg15 a day ago | parent [-]

> but there's no guarantee that it's done by the time the outer function is.

As far as I understood it, calling an async function without await (or create_task()) does not run the function at all - there is no uncertainty involved.

Async functions work sort of like generators in that the () operator just creates a temporary object to store the parameters. The 'await' or create_task() are the things that actually execute the function - the first immediately runs it in the same task as the containing function, the second creates a new task and puts that in the event queue for later execution.

So

  asyncFunc()
without anything else is a no-op. It creates the object for parameter storage ("coroutine object") and then throws it away, but never actually calls (or schedules) asyncFunc.
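The generator analogy can be made concrete with a small sketch that drives a coroutine object by hand, without any event loop (`async_func` here is a made-up example that never suspends):

```python
async def async_func(x):
    return x * 2

coro = async_func(21)      # stores the argument, runs nothing
try:
    coro.send(None)        # drive the coroutine by hand, like next() on a generator
except StopIteration as stop:
    result = stop.value    # the return value travels in StopIteration
```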

When queuing the function in a new task with create_task(), then you're right - there is no guarantee the function would finish, or even would have started before the outer function completed. But the new task won't have any relationship to the task of the outer function at all, except if the outer function explicitly chooses to wait for the other task, using the Future object that was returned by create_task.
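As a sketch of that last point, with hypothetical `inner` and `outer` functions: the new task has not even started when `outer` returns, and it only gets tied back to the caller because the caller chooses to await the returned Task.

```python
import asyncio

log = []

async def inner():
    log.append("inner ran")

async def outer():
    task = asyncio.create_task(inner())
    log.append("outer done")
    return task  # hand back the Task so the caller may choose to wait

async def main():
    task = await outer()
    started_before_outer_finished = "inner ran" in log
    await task
    return started_before_outer_finished

started_early = asyncio.run(main())
```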

xg15 2 days ago | parent | prev [-]

I think the general idea of function colors has some merit - when done right, it's a crude way to communicate information about a function's expected runtime in a way that can be enforced by the environment: A sync function is expected to run short enough that it's not user-perceptible, whereas an async function can run for an arbitrary amount of time. In "exchange", you get tools to manage the async function while it runs. If a sync function runs too long (on the event loop) this can be detected and flagged as an error.
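Python's asyncio has a mild form of this in its debug mode: it logs a warning (not an error) when a single step on the loop takes longer than `loop.slow_callback_duration`, 0.1 seconds by default. A minimal sketch, where the `Collect` handler and `too_slow` are invented to capture and trigger the warning:

```python
import asyncio
import logging
import time

messages = []

class Collect(logging.Handler):
    """Hypothetical handler that just stores asyncio's log messages."""
    def emit(self, record):
        messages.append(record.getMessage())

logging.getLogger("asyncio").addHandler(Collect())

async def too_slow():
    time.sleep(0.3)  # synchronous work hogging the loop

asyncio.run(too_slow(), debug=True)
```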

Maybe a useful approach for a language would be to make "colors" a first-class part of the type system and support them in generics, etc.

Or go a step further and add full-fledged time complexity tracking to the type system.

munificent 2 days ago | parent | next [-]

> Maybe a useful approach for a language would be to make "colors" a first-class part of the type system and support them in generics, etc.

Rust has been trying to do that with "keyword generics": https://blog.rust-lang.org/inside-rust/2023/02/23/keyword-ge...

lmm 2 days ago | parent | prev [-]

> Maybe a useful approach for a language would be to make "colors" a first-class part of the type system and support them in generics, etc.

This is what languages with higher-kinded types do and it's glorious. In Scala you write your code in terms of a generic monad and then you can reuse it for sync or async.

gloomyday 2 days ago | parent | prev | next [-]

I remember trying to use async in Python for the first time in 2017, and I actually found it easier to learn the basics of Go to create a coroutine, export it as a shared library, and create the bindings. I'm not exaggerating.

If I remember correctly, the Python async API was still in an experimental phase at that time.

nateglims 2 days ago | parent | prev | next [-]

The API complexity really threw me when I last tried async python. It's very different from other async systems and is incredibly different from gevent or twisted which were popular when I was last writing server python.

codethief 2 days ago | parent | prev | next [-]

> but then some of the sync functions can only be called from an async function, because they expect an event loop to be present

I agree that that's annoying but tbh it sounds like any other piece of code to me that relies on global state. (Man, I can't wait for algebraic effects to become mainstream…)

Retr0id 2 days ago | parent | prev | next [-]

> some of the sync functions can only be called from an async function, because they expect an event loop to be present

I recognise that this situation is possible, but I don't think I've ever seen it happen. Can you give an example?

xg15 2 days ago | parent [-]

Everything that directly interacts with an event loop object and calls methods such as loop.call_soon() [1].

This is used by most of asyncio's synchronization primitives, e.g. asyncio.Queue.

A consequence is that you cannot use asyncio Queues to pass messages or work items between async functions and worker threads. (And of course you can't use regular blocking queues either, because they would block).

The only solution is to build your own ad-hoc system using loop.call_soon_threadsafe() or use third-party libs like Janus[2].
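A rough sketch of such an ad-hoc bridge, assuming a single worker thread and a `None` sentinel to signal the end (both are choices made for this example, not part of any library): the thread never touches the queue itself, it asks the loop to do each `put_nowait`.

```python
import asyncio
import threading

def worker(loop, queue):
    # Runs in a plain thread: asyncio.Queue is not thread-safe,
    # so each put is handed to the loop to execute.
    for item in range(3):
        loop.call_soon_threadsafe(queue.put_nowait, item)
    loop.call_soon_threadsafe(queue.put_nowait, None)  # sentinel: no more items

async def main():
    loop = asyncio.get_running_loop()
    queue = asyncio.Queue()
    thread = threading.Thread(target=worker, args=(loop, queue))
    thread.start()
    received = []
    while (item := await queue.get()) is not None:
        received.append(item)
    thread.join()
    return received

received = asyncio.run(main())
```

This only covers the thread-to-loop direction; going the other way needs a separate mechanism.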

[1] https://github.com/python/cpython/blob/e4e2390a64593b33d6556...

[2] https://github.com/aio-libs/janus

int_19h 2 days ago | parent | prev [-]

Generators are orthogonal to all this. They are the equivalent of `function*` in JS. And yes, they are also coroutines, but experience has shown that keeping generators separate from generic async functions is more ergonomic (hence why C# and JS both do the same thing).

xg15 2 days ago | parent [-]

True. I think the connection is more a historical one because the first async implementation was done using generators and lots of "yield from" statements AFAIK.

But I think generators are still sometimes mentioned in tutorials for this reason.

int_19h 2 days ago | parent [-]

Implementing what was essentially an equivalent of `await` on top of `yield` (before we got `yield from` even) was a favorite pastime at some point. I worked on a project that did exactly that for WinRT projection to Python. And before that there was Twisted. It's very tempting because it gets you like 90% there. But then eventually you want something like `async for` etc...
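A toy sketch of the idea, not the code from any of those projects: generators yield "requests" and a tiny driver sends the answers back in. It uses `yield from` for brevity, and the `fetch`, `workflow` and `run` names as well as the "double" operation are made up.

```python
def fetch(value):
    # A "leaf" operation: yield a request, receive the answer via send()
    answer = yield ("double", value)
    return answer

def workflow():
    a = yield from fetch(1)   # plays the role of `await fetch(1)`
    b = yield from fetch(a)
    return a + b

def run(gen):
    """Tiny driver standing in for an event loop."""
    try:
        request = next(gen)
        while True:
            op, arg = request
            request = gen.send(arg * 2)  # "complete" the request immediately
    except StopIteration as stop:
        return stop.value

result = run(workflow())
```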