williamdclt | 15 hours ago
Not super interesting: this is fairly basic stuff that you'll encounter at orders of magnitude smaller scale than OpenAI. Creating indexes CONCURRENTLY, avoiding table rewrites, smoothing out traffic, tx timeouts, read replicas... it's pretty much table stakes, even at 10,000x smaller scale. Their requests to Postgres devs aren't anything new either; everyone has wished for them for years.

The title is kind of misleading: they're not scaling Postgres to the "next level", they're clearly struggling with this single-master setup and trying to keep it afloat while migrating off ("no new workloads allowed"). The main "next scale" point is that they say they can "scale gracefully under massive read loads", which is nothing new: that's the whole point of read replicas and horizontal scaling.

Re: the "Lao Feng Q&A":

> PostgreSQL actually does have a feature to disable indexes. You can simply set the indisvalid field to false in the pg_index system catalog [...] It's not black magic.

No. It's not documented for this use, so it's not a feature. It's fooling around with internals without any guarantee of what it will do: it might do what you want today and not in the next release. Plus, as they point out, managed Postgres providers don't let you fiddle with this stuff (for good reason, as this is not a feature).

> there's a simpler solution [to avoiding accidental deletion of used indexes]: just confirm via monitoring views that the index is not being used on either primary or replicas

That doesn't quite solve all the same problems. It's quite frequent that an index is in use but not _needed_: another index would also work (e.g. you introduced a new index covering an extra column that's not used in this query). Being able to disable an index would allow checking that the query plan does use the other index, rather than praying and hoping.
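For concreteness, here's roughly what the two approaches look like side by side (index, table, and column names are made up; the catalog update is exactly the unsupported poking-at-internals I'm objecting to, superuser-only and blocked on managed services):

    -- Supported: usage counters. These are per-instance, so you need to check
    -- the primary and every replica separately.
    SELECT relname, indexrelname, idx_scan
    FROM pg_stat_user_indexes
    WHERE indexrelname = 'idx_orders_user_created';

    -- The undocumented catalog hack the Q&A describes: hides the index from the
    -- planner (it is still maintained while indisready is true), superuser only,
    -- no guarantees across releases.
    UPDATE pg_index SET indisvalid = false
    WHERE indexrelid = 'idx_orders_user_created'::regclass;

    -- The thing a real "disable index" feature would let you do safely: verify
    -- the plan falls back to the other index before dropping anything.
    EXPLAIN SELECT * FROM orders WHERE user_id = 42;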
sgarland | 13 hours ago
> they're clearly struggling with this single-master setup and trying to keep it afloat while migrating off ("no new workloads allowed")

TFA states they're at 1 million QPS, in Azure. 1 million QPS with real workloads is impressive, doubly so from a cloud provider that's almost certainly using network-based storage.

EDIT: they have an aggregate of 1 million QPS across ~40 read replicas, so 25K QPS each, modulo writes. I am less impressed.

> That doesn't quite solve all the same problems. It's quite frequent that an index is in use but not _needed_: another index would also work (e.g. you introduced a new index covering an extra column that's not used in this query). Being able to disable an index would allow checking that the query plan does use the other index, rather than praying and hoping.

Assuming your table statistics are decently up to date and representative (which you can check), this basically comes down to knowing your RDBMS and your data. For example, if it's a text-type column, do both indices have the same operator class (or lack thereof)? Does the new index have a massive column in addition to the one you need, or is it reasonably small? Do the query projections and/or selections still form a left-most prefix of the index (especially important if any queries perform ordering)?
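All of that can be checked from SQL before touching anything; a rough sketch, with invented table/index/column names:

    -- How fresh are the planner statistics for the table?
    SELECT relname, last_analyze, last_autoanalyze, n_mod_since_analyze
    FROM pg_stat_user_tables
    WHERE relname = 'orders';

    -- Compare the index definitions: column order (left-most prefix), INCLUDE
    -- columns, and any non-default operator classes all show up here.
    SELECT indexrelid::regclass AS index_name, pg_get_indexdef(indexrelid) AS definition
    FROM pg_index
    WHERE indrelid = 'orders'::regclass;

    -- And confirm which index the planner actually picks for the query in question.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT id, total FROM orders WHERE user_id = 42 ORDER BY created_at;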
jfim | 15 hours ago
I'm pretty perplexed as well. They mention that they're not sharding PostgreSQL, without saying why in the article, but isn't that the obvious answer to many of their scaling problems? I don't really see what it is that they're doing that requires a single master database; it seems that sharding on a per-user basis would make things way easier for them.
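Even something as simple as routing by a hash of the user id covers a lot of it. Purely illustrative sketch (shard count, table, and column names are made up; hashtext() is Postgres's internal text hash, callable but not formally documented, and application code could use any stable hash instead):

    -- Map each user to one of 16 hypothetical shards.
    SELECT user_id,
           abs(hashtext(user_id::text)) % 16 AS shard
    FROM users
    LIMIT 10;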