tdumitrescu 2 days ago

That's the main problem with evented servers in general, isn't it? If any one of your workloads is CPU-intensive, it can block everything else being served on the same thread, so requests that should always be snappy can end up taking unpredictably long in practice. Basically, if you have any CPU-heavy work, it shouldn't go in that same server.
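To make the blocking effect concrete, here is a minimal sketch using Python's asyncio. The names `cpu_heavy`, `slow_handler`, `fast_handler`, and `measure` are hypothetical stand-ins, and the busy loop only simulates a CPU-bound request; it is not taken from any real server.

```python
import asyncio
import time


def cpu_heavy(seconds: float) -> None:
    """Busy-loop as a stand-in for CPU-bound work (hypothetical workload)."""
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass


async def slow_handler(seconds: float) -> None:
    # Runs the CPU work directly on the event loop thread and never yields.
    cpu_heavy(seconds)


async def fast_handler() -> float:
    """Return how long a request that should be instant actually waited."""
    start = time.perf_counter()
    await asyncio.sleep(0)  # yield once; we resume only when the loop is free
    return time.perf_counter() - start


async def measure(block_seconds: float) -> float:
    # Both handlers share one event loop, i.e. one thread.
    fast = asyncio.create_task(fast_handler())
    slow = asyncio.create_task(slow_handler(block_seconds))
    await asyncio.gather(fast, slow)
    return fast.result()


if __name__ == "__main__":
    waited = asyncio.run(measure(0.2))
    print(f"fast handler waited {waited:.3f}s behind the CPU-heavy one")
```

The fast handler does no work of its own, yet it cannot finish until the CPU-heavy handler gives the thread back, so its latency is set by whatever else happens to be running.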

acdha 2 days ago

Indeed. Async is one of those things that makes a big difference in a handful of scenarios but got promoted as a best practice for everything. Python developers have simply joined Node and Go developers in learning that it’s not magic “go faster” spray, and that reasoning about things like peak memory load or shared resource management can be harder.

PaulHoule 2 days ago

My system is written in Python because it is supported by a number of batch jobs that use code from SBERT, scikit-learn, numpy and such. Currently the server doesn't do any complex calculations; under asyncio that was a strict no-no. Mostly it does database queries and formats HTML responses, but it seems like even that is too much CPU.

My take on gunicorn is that it doesn't need any tuning or care to handle anything up to the large workgroup size other than maybe "buy some more RAM" -- and now if I want to do some inference in the server or use pandas to generate a report I can do it.

If I had to go bigger I probably wouldn't be using Python in the server and would have to face up to either dual language or doing the ML work in a different way. I'm a little intimidated about being on the public web in 2025 though with all the bad webcrawlers. Young 'uns just never learned everything that webcrawler authors knew in 1999. In 2010 there were just two bad Chinese webcrawlers that never sent a lick of traffic to anglophone sites, but now there are new bad webcrawlers every day it seems.

nly 2 days ago

OS threads are for CPU bound work.

Async is for juggling lots of little initialisations, completions, and coordinating work.

Many apps are best single-threaded, with a thread pool to run (single-threaded) long-running tasks.
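A rough sketch of that layout in Python, assuming asyncio for the single-threaded part and `concurrent.futures.ThreadPoolExecutor` for the pool. `long_task` and `main` are hypothetical names, and the busy loop is only a placeholder for a real job. One caveat: in CPython the GIL means pool threads keep the loop responsive but do not run pure-Python CPU work in parallel; a process pool would be the usual swap for that.

```python
import asyncio
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def long_task(seconds: float) -> str:
    """Hypothetical long-running job; returns the name of the thread it ran on."""
    deadline = time.perf_counter() + seconds
    while time.perf_counter() < deadline:
        pass
    return threading.current_thread().name


async def main(block_seconds: float) -> tuple[str, float]:
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=2, thread_name_prefix="worker") as pool:
        # Hand the long task to the pool; the event loop thread stays free.
        job = loop.run_in_executor(pool, long_task, block_seconds)

        # Meanwhile, time a small piece of loop work to see how responsive it is.
        start = time.perf_counter()
        await asyncio.sleep(0.01)
        lag = time.perf_counter() - start

        worker_name = await job
    return worker_name, lag


if __name__ == "__main__":
    name, lag = asyncio.run(main(0.3))
    print(f"long task ran on {name}; loop work took {lag:.3f}s meanwhile")
```

The coordinating code stays on one thread and only the long task leaves it, so the loop's small pieces of work finish well before the long task does.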

materielle 2 days ago

Traditionally, there are two strategies:

1) Use the network thread pool to also run application code. Then your entire program has to be super careful not to block or do CPU-intensive work. This is efficient but leads to difficult-to-maintain programs.

2) The network thread pool passes work back and forth with an application executor. That way, the network thread pool is never starved by the application, since they are essentially two different work queues. This works great, but now every request performs multiple thread hops, which increases latency.
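Strategy 2 can be sketched with two `ThreadPoolExecutor` instances in Python. Everything here is illustrative: `net_pool`, `app_pool`, and `handle_request` are hypothetical names, the "application logic" is just `str.upper`, and a real server would read from and write to sockets, not a queue. The trace records which pool each stage ran on, so the thread hops are visible.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

# Two separate work queues: one for network I/O, one for application code.
net_pool = ThreadPoolExecutor(max_workers=1, thread_name_prefix="net")
app_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="app")


def handle_request(payload: str) -> tuple[str, list[str]]:
    """Process one request and return (response, names of threads visited)."""
    reply: queue.Queue = queue.Queue(maxsize=1)
    trace: list[str] = []

    def read_request() -> None:
        trace.append(threading.current_thread().name)
        app_pool.submit(run_app)  # hop 1: network thread -> application thread

    def run_app() -> None:
        trace.append(threading.current_thread().name)
        result = payload.upper()  # placeholder for real application work
        net_pool.submit(write_response, result)  # hop 2: back to the network

    def write_response(result: str) -> None:
        trace.append(threading.current_thread().name)
        reply.put((result, trace))

    net_pool.submit(read_request)
    return reply.get(timeout=5)


if __name__ == "__main__":
    response, visited = handle_request("ping")
    print(response, visited)
```

The network thread never runs application code, so a slow `run_app` cannot starve it; the cost is the two handoffs every request pays.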

There has been a lot of interest lately in combining scheduling and work-stealing algorithms to create a best-of-both-worlds executor.

You could imagine, theoretically, an executor that auto-scales, maintains different work queues, and tries to avoid thread hops when possible, but ensures there are always threads available for the network.

guappa 2 days ago

Backend developers finding out why user interfaces have a thread for the GUI and a thread for doing work :D