PaulHoule 2 days ago

I went through a phase of writing asyncio servers for my side projects. Probably the most fun I had was writing things that were responsive in complex ways, such as a websockets server that was also listening on message queues or on a TCP connection to a Denon HEOS music player.
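The nice part is how small that shape stays in plain asyncio: one loop, a couple of listeners, a queue in the middle. A rough sketch of the idea (the host, port, and line protocol are invented; a real HEOS or websocket client would look different):

```python
import asyncio

async def watch_player(events: asyncio.Queue) -> None:
    # long-lived TCP connection to some other device (host/port made up)
    reader, _writer = await asyncio.open_connection("player.local", 1255)
    while line := await reader.readline():
        await events.put(line.decode().strip())

async def main() -> None:
    events: asyncio.Queue = asyncio.Queue()

    async def handle_client(reader, writer):
        # each connected client drains events as the other source produces them
        while True:
            event = await events.get()
            writer.write((event + "\n").encode())
            await writer.drain()

    server = await asyncio.start_server(handle_client, "0.0.0.0", 8765)
    async with server:
        await asyncio.gather(server.serve_forever(), watch_player(events))

asyncio.run(main())
```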

Eventually I wrote an "image sorter" that I found was hanging up when the browser tried to download images in parallel. The image serving should not have been CPU bound (I was even using sendfile()), but I think other requests would hold up the CPU and block the tiny amount of CPU needed to set up that sendfile.

So I switched from aiohttp to the Flask API and serve with either Flask's built-in server or Gunicorn. I even front it with Microsoft IIS or nginx to handle the images so Python doesn't have to. It is a minor hassle because I develop on Windows, so I have to run Gunicorn inside WSL2, but it works great and I don't have to think about server performance anymore.
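A minimal sketch of that arrangement, assuming nginx (or IIS) is configured to serve /images/ straight from disk; the route and filename here are made up:

```python
# app.py -- run with e.g. `gunicorn -w 4 -b 127.0.0.1:8000 app:app`
# (inside WSL2 on Windows); nginx or IIS sits in front and serves
# /images/ directly so the Python workers never touch the files.
from flask import Flask

app = Flask(__name__)

@app.route("/")
def index():
    # the <img> URLs point under /images/, which the front-end web
    # server maps to a directory on disk
    return '<img src="/images/example.jpg">'
```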

tdumitrescu 2 days ago | parent | next [-]

That's the main problem with evented servers in general, isn't it? If any one of your workloads is CPU-intensive, it has the potential to block the serving of everything else on the same thread, so requests that should always be snappy can end up taking randomly long times in practice. Basically, if you have any CPU-heavy work, it shouldn't go in that same server.
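A toy illustration of that failure mode (nothing here is from a real server; the loop count is just big enough to make the stall visible). The CPU-bound coroutine never awaits, so the loop can't resume the task that should have woken after 0.1s:

```python
import asyncio, time

async def cpu_heavy():
    # pure-Python number crunching: never awaits, so it never yields the loop
    sum(i * i for i in range(20_000_000))

async def should_be_snappy():
    start = time.monotonic()
    await asyncio.sleep(0.1)
    print(f"woke after {time.monotonic() - start:.2f}s (expected ~0.1s)")

async def main():
    # the snappy task is scheduled first, but still has to wait for cpu_heavy
    await asyncio.gather(should_be_snappy(), cpu_heavy())

asyncio.run(main())
```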

acdha 2 days ago | parent | next [-]

Indeed. async is one of those things which makes a big difference in a handful of scenarios but which got promoted as a best practice for everything. Python developers have simply joined Node and Go developers in learning that it’s not magic “go faster” spray, and that reasoning about things like peak memory load or shared resource management can be harder.

PaulHoule 2 days ago | parent | prev | next [-]

My system is written in Python because it is supported by a number of batch jobs that use code from SBERT, scikit-learn, numpy and such. Currently the server doesn't do any complex calculations, but under asyncio those were a strict no-no anyway. Mostly it does database queries and formats HTML responses, but it seems like even that is still too much CPU.

My take on Gunicorn is that it doesn't need any tuning or care to handle anything up to large-workgroup scale, other than maybe "buy some more RAM" -- and now if I want to do some inference in the server or use pandas to generate a report, I can do it.

If I had to go bigger I probably wouldn't be using Python in the server and would have to face up to either a dual-language setup or doing the ML work in a different way. I'm a little intimidated about being on the public web in 2025, though, with all the bad webcrawlers. Young 'uns just never learned everything that webcrawler authors knew in 1999. In 2010 there were just two bad Chinese webcrawlers that never sent a lick of traffic to anglophone sites, but now it seems there are new bad webcrawlers every day.

nly 2 days ago | parent | prev | next [-]

OS threads are for CPU bound work.

Async is for juggling lots of little initialisations, completions, and coordinating work.

Many apps are best written single-threaded, with a thread pool to run (single-threaded) long-running tasks.
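In Python that split can be as small as this (asyncio.to_thread, Python 3.9+; slow_report stands in for whatever the long-running task really is):

```python
import asyncio
import time

def slow_report() -> str:
    time.sleep(2)          # blocking / long-running work, off the main thread
    return "report ready"

async def main():
    result, _ = await asyncio.gather(
        asyncio.to_thread(slow_report),   # runs in the default thread pool
        asyncio.sleep(0.1),               # the loop stays responsive meanwhile
    )
    print(result)

asyncio.run(main())
```

For pure-Python CPU-bound work the GIL still limits what a thread pool buys you, so a ProcessPoolExecutor via run_in_executor tends to be the better home for that kind of task.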

materielle 2 days ago | parent | prev | next [-]

Traditionally, there are two strategies:

1) Use the network thread pool to also run application code. Then your entire program has to be super careful not to block or do CPU-intensive work. This is efficient but leads to difficult-to-maintain programs.

2) The network thread pool hands work back and forth to a separate application executor. That way, the network thread pool is never starved by the application, since there are essentially two different work queues. This works great, but now every request performs multiple thread hops, which increases latency (this strategy is sketched below).

There has been a lot of interest lately in combining scheduling and work-stealing algorithms to create a best-of-both-worlds executor.

You could imagine, theoretically, an executor that auto-scales, maintains different work queues, and tries to avoid thread hops when possible, but still ensures there are always threads available for the network.
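For concreteness, here is strategy 2 in miniature in Python (everything is invented; the point is just the hop from the network side to a separate application executor and back):

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

app_pool = ProcessPoolExecutor(max_workers=4)   # the "application executor"

def application_logic(payload: bytes) -> bytes:
    # arbitrarily expensive; runs far away from the network event loop
    return payload.upper()

async def handle(reader, writer):
    data = await reader.read(1024)               # network side
    loop = asyncio.get_running_loop()
    # hop 1: ship the request to the application pool; hop 2: result comes back
    result = await loop.run_in_executor(app_pool, application_logic, data)
    writer.write(result)                         # back on the network side
    await writer.drain()
    writer.close()

async def main():
    server = await asyncio.start_server(handle, "127.0.0.1", 9000)
    async with server:
        await server.serve_forever()

if __name__ == "__main__":
    asyncio.run(main())
```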

guappa 2 days ago | parent | prev [-]

Backend developers finding out why user interfaces have a thread for the GUI and a thread for doing work :D

Townley 2 days ago | parent | prev [-]

It’s heartening that there are people who find the problem you described “fun”.

Writing a FastAPI websocket that reads from a redis pubsub is a documentation-less flailfest
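For anyone else flailing, the shape that seems to work is roughly this (FastAPI plus redis-py's asyncio client, redis-py 5.x; the URL and channel name are placeholders, and reconnection/error handling are omitted):

```python
import redis.asyncio as redis
from fastapi import FastAPI, WebSocket

app = FastAPI()

@app.websocket("/ws")
async def ws_endpoint(websocket: WebSocket):
    await websocket.accept()
    r = redis.from_url("redis://localhost")
    pubsub = r.pubsub()
    await pubsub.subscribe("events")
    try:
        # listen() yields subscribe confirmations too, so filter on type
        async for message in pubsub.listen():
            if message["type"] == "message":
                await websocket.send_text(message["data"].decode())
    finally:
        await pubsub.unsubscribe("events")
        await r.aclose()
```

Run it under uvicorn as usual; the blocking pitfalls discussed upthread still apply to anything else you do inside that handler.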