vel0city 11 hours ago

Isn't this kind of the reason why teams will tend to put database proxies in front of their postgres instances, to handle massive sudden influxes of potentially short lived connections?

This sounds exactly like the problem tools like pgbouncer were designed to solve. If you're on AWS, you could look at RDS Proxy.
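
Something like this minimal pgbouncer.ini (untested, all names and numbers are placeholders) is usually enough to put pgbouncer in transaction-pooling mode in front of the database:

    [databases]
    ; forward "appdb" to the real Postgres host
    appdb = host=db.internal port=5432 dbname=appdb

    [pgbouncer]
    listen_addr = 0.0.0.0
    listen_port = 6432
    auth_type = md5
    auth_file = /etc/pgbouncer/userlist.txt
    ; transaction pooling: a server connection is held only for the duration
    ; of a transaction, so thousands of short-lived client connections share
    ; a small, steady pool of real backends
    pool_mode = transaction
    default_pool_size = 20
    max_client_conn = 5000

Clients just point at port 6432 instead of 5432; the postmaster only ever sees the small pool.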

evanelias 11 hours ago | parent | next [-]

Also check out ProxySQL [1][2], it's an extremely powerful and battle-tested proxy. Originally it was only for MySQL/MariaDB, where it is very widely used at scale, despite MySQL already having excellent built-in scalable threaded connection management. ProxySQL added Postgres support in 2024, and that has become a major focus.

[1] https://proxysql.com/

[2] https://github.com/sysown/proxysql

darkwater 7 hours ago | parent | next [-]

+1 to ProxySQL, especially in RDS environments with huge monoliths attached that open a shitload of connections. RDS has a fixed max_connections depending on the instance size, so if you don't want to pay $$$$ for bigger but underused instances (while you're trying to get the devs to update the hundreds of old dependencies in the monolith), ProxySQL can save your day. It did, for me. And yes, it's a self-managed system, but it's pretty easy to operate and very stable.
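
For reference, capping what actually reaches RDS is just a couple of rows in ProxySQL's admin interface. A sketch (hostname and numbers are made up; shown with the classic MySQL tables, the newer Postgres support is configured the same way):

    -- run against the ProxySQL admin interface (port 6032 by default)
    INSERT INTO mysql_servers (hostgroup_id, hostname, port, max_connections)
    VALUES (0, 'my-rds-instance.abc123.eu-west-1.rds.amazonaws.com', 3306, 200);
    LOAD MYSQL SERVERS TO RUNTIME;
    SAVE MYSQL SERVERS TO DISK;

The monolith can keep opening as many client connections as it likes; ProxySQL multiplexes them onto at most 200 backend connections.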

wreath 2 hours ago | parent [-]

You can change max_connections in RDS, though. The default is insanely high and I have no idea why it is set that way: 4 vCPU instances run with ~5k max_connections IIRC, and I have never seen an instance that size handle more than 100-200 concurrent connections on a CPU-bound workload.
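
For anyone who hasn't dug into it: the default comes from a formula in the default parameter group (roughly LEAST(DBInstanceClassMemory/9531392, 5000) on RDS Postgres, if I remember right), and you override it with a custom parameter group attached to the instance, something like:

    # assumes a custom parameter group named "app-pg" already exists and is attached
    aws rds modify-db-parameter-group \
      --db-parameter-group-name app-pg \
      --parameters "ParameterName=max_connections,ParameterValue=500,ApplyMethod=pending-reboot"

max_connections is a static parameter, so it only takes effect after a reboot.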

sroussey 11 hours ago | parent | prev [-]

And lets you rewrite queries on the fly. :)
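
A toy example via the admin interface (regex and rule_id are made up):

    -- strip FOR UPDATE from a misbehaving legacy client's SELECTs
    INSERT INTO mysql_query_rules (rule_id, active, match_pattern, replace_pattern, apply)
    VALUES (1, 1, '^SELECT (.*) FOR UPDATE$', 'SELECT \1', 1);
    LOAD MYSQL QUERY RULES TO RUNTIME;
    SAVE MYSQL QUERY RULES TO DISK;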

pmontra 10 hours ago | parent | prev | next [-]

The article is very well written but is somewhat lacking at the end.

The conclusion lists pgbouncer as one of the solutions but it does not explain it clearly.

> Many pieces of wisdom in the engineering zeitgeist are well preached but poorly understood. Postgres connection pooling falls neatly into this category. In this expedition we found one of the underlying reasons that connection pooling is so widely deployed on postgres systems running at scale. [...] an artificial constraint that has warped the shape of the developer ecosystem (RDS Proxy, pgbouncer, pgcat, etc) around it.

The artificial constraint is the single-core nature of the postmaster.

Other points at the end of the article that can be improved:

> we can mechnically reason about a solution.

Mechanically as in letting an AI find a solution, or as in reasoning like a mechanic, or? Furthermore:

> * Implementing jitter in our fleet of EC2 instances reduced the peak connection rate

How? Did they wait a random number of milliseconds before sending queries to the db?
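
My guess is that each instance sleeps a random amount of time before (re)opening its connections, something like this (pure speculation, the article doesn't show it):

    import random
    import time

    import psycopg2  # guessing at the client library

    # spread the fleet's connection attempts over a window instead of
    # letting every instance hit the postmaster at the same instant
    time.sleep(random.uniform(0, 2.0))  # the 2-second window is invented
    conn = psycopg2.connect(host="db.internal", dbname="appdb", user="app")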

> * Eliminating bursts of parallel queries from our API servers

How?

tbrownaw 10 hours ago | parent [-]

> Mechanically as in letting an AI find a solution, or as in reasoning like a mechanic, or?

As in it's fully characterized, so you can use only math and logic rather than relying on experience and guesswork.
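
For example (numbers invented): if fork plus auth costs the postmaster roughly 1 ms of single-core CPU per new connection, it can absorb at most ~1,000 new connections per second. A deploy where 200 instances each open 20 connections at once produces 4,000 connection attempts, so it has to be spread over at least ~4 seconds of jitter to stay under that ceiling. That's arithmetic from the constraint, no load testing or guesswork required.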

solatic 6 hours ago | parent | prev [-]

PgBouncer introduces its own problems and, strictly speaking, adds complexity to your infrastructure. It needs to be monitored and scaled separately, not to mention choosing between its session/transaction/statement pooling modes. Adding another proxy in the middle also increases latency.

Yes, currently, putting PgBouncer in the middle helps handle massive sudden influxes of potentially short lived connections. That is indeed the correct current best practice. But I hardly fault the author for wishing that postmaster could be configured to run on multiple cores so that the additional complexity of running PgBouncer for this relatively simple use-case could be eliminated.