Remix.run Logo
yabones 5 hours ago

I'm curious how this might help with our biggest downtime-causer with postgres, which is major version upgrades. Poolers do a great job for failover and load balancing, but we consistently need ~10-20 minutes of downtime once or twice a year to do upgrades. Logical replication between old->new versions could probably help, but it would still require flipping everything over to the new cluster without partial writes or anything silly. Anybody have experience with this?

tempest_ 5 hours ago | parent | next [-]

We use logical replication and a pause / swap in pgbouncer for ~5s of paused (but not failed) writes.

This is for DBs that are ~1-1.5TB but doesnt have a huge amount of churn/qps

Effectively what is described here https://www.pgedge.com/blog/always-online-or-bust-zero-downt...

tux3 5 hours ago | parent | prev | next [-]

Logical replication is how this is typically done. If you have some infra-as-code setup, you create a new cluster with identical settings except for the major version, import the schema, start copying data from a read-replica running the old version, stop accepting writes from the old version (downtime starts), sync the sequence numbers, and point your services to the new cluster (downtime ends).

If you use something like CloudNativePG they automate parts of the process with cli tools and declarative syntax. Otherwise you take the time to figure it out by hand. It might sound complicated, but just practice on your staging DB, and if all goes well you do the same procedure in prod.

Edit: Apparently Postgres 19 has a patch for one-shot logical replication of sequences! https://www.depesz.com/2025/11/11/waiting-for-postgresql-19-...

paulryanrogers 4 hours ago | parent [-]

RDS has blue green deployments that can help. It was rough at first, though seems they worked out the kinks.

boxed 5 hours ago | parent | prev | next [-]

Seconded. Coming from MySQL this is a huge regression that makes Postgres look like something from the 80s. I still wonder why this isn't seen as the absolutely highest priority.

jeltz 5 hours ago | parent | next [-]

I have not ran MySQL for some years but it at least used to have exactly the same issue. Upgrading a database with MySQL can take a long time if you have many tables. The main difference is only really that PostgreSQL does it with a separate tool, pg_upgrade, while MySQL does it as part of the main binary.

For both MySQL and PostgreSQL you will need to use some kind of logical upgrades if you want no downtime.

boxed 4 hours ago | parent | next [-]

They don't change the on-disk structure all the time though...

jeltz an hour ago | parent [-]

Mostly because MySQL development is slower.

tomnipotent 5 hours ago | parent | prev [-]

MySQL has advocated for decades spinning up a replica with the upgraded version, waiting for it to catch up to master before promoting it to the new master. You can do the same thing with Postgres.

jeltz 5 hours ago | parent [-]

Exactly, MySQL and PostgreSQL are the same here. Maybe one is a bit faster than the other at doing major version upgrades but the behaviours are quite similar.

Blackthorn 5 hours ago | parent | prev [-]

Probably because it's an open source project and apparently none of its users cared about this feature enough to develop it or fund it.

jeltz 4 hours ago | parent | next [-]

It is also a bit tricky tradeoff. You do not want to be stuck with the same data format forever. So databases like MySQL and PostgreSQL need a downtime when doing a major version upgrade. They both try to keep it short, usually seconds, but minutes can happen in either database.

5 hours ago | parent | prev [-]
[deleted]
tschellenbach 5 hours ago | parent | prev | next [-]

Logical replication solves this. You roll the cluster, downtime is minimal. like 60s maybe.

briffle 4 hours ago | parent [-]

Logical replication needs a special 'upgrade' use case that will automate most of its pain points away. I understand why DDL does not replicate, and that you may want to replicate to a data warehouse that only needs some columns, etc, but there should be a case just for upgrading that handles all DDL, sequences all existing everything, and just works...

znpy 4 hours ago | parent | prev [-]

It's weird that PostgreSQL still doesn't have a proper, open source, general multi-master implementation.

At this point i wonder if i'll ever see that.

jjice 4 hours ago | parent | next [-]

Do other RDBMSs have this? I genuinely have no clue. I've been fortunate enough to be able to get away with one primary and multiple secondaries at my largest usage of Postgres. Multi-master is the kind of thing I am fully out of my depth on, so I'm curious if there's a well defined path for implementation here or what.

hylaride 3 hours ago | parent | next [-]

Commercial RDBMS (oracle/mssql) have had it in some form for awhile, with pluses and minuses. Open source DBs have had bolt-ons, including BDR for pgsql.

Multi-master is hard. The main issue is what to do with commit/replication lag. It's far "easier" if support for eventual consistency is ok with your use case. In some cases it's not. Also, the problems related to read-only lag can happen on multi-master instances. If somebody does a giant long running query on one of the masters, the target instance needs to hold the data state for the query, even if the underlying DB is getting updates. It also needs to still keep up with other masters. This means the whole cluster can slow down if the multi-master replication is synchronous. Depending on a variety of factors, that can chew up disk space, memory, etc.

There are ways of dealing with these issues (and others), but it comes with tradeoffs with performance, etc.

aynyc 4 hours ago | parent | prev [-]

MySQL has Galera cluster for that.

dpedu 3 hours ago | parent [-]

And Group Replication

znpy 9 minutes ago | parent [-]

And percona xtradb cluster

timacles 4 hours ago | parent | prev [-]

It has been tried many times. Good luck to pgdog, but there’s a reason these projects don’t stick.

Multi master, from even a conceptual perspective, is incredibly complicated. Databases, transactions, consistency, parallelism are all very complicated.

It’s something that always seems promising at the start but as soon as maintenance and long term improvements enter the picture(ie integrating new Postgres versions), the complexity becomes too much.