Exadata would definitely be an option and it has its pros (I use it at work, but that decision wasn't mine). Some of the features you forgot to mention are:
- The built-in bloom filters for reducing the number of blocks scanned during full table scans (FTS)
- Oracle (not specific to Exadata) supports reverse key indexes to help address index block contention under high write loads. In Postgres you can index the result of `reverse()` on a string, but that reverses the characters rather than the actual byte representation, so it doesn't work on int/bigint keys (a rough sketch of the difference is below)
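A sketch of what I mean, assuming a table named `events` with a numeric `id` column, placeholder hosts/credentials, and the Oracle and Postgres JDBC drivers on the classpath (all names here are made up for illustration):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ReverseIndexSketch {
    public static void main(String[] args) throws Exception {
        // Oracle: a reverse key index stores the key bytes reversed, spreading
        // sequential ids across leaf blocks and easing hot-block contention.
        try (Connection ora = DriverManager.getConnection(
                "jdbc:oracle:thin:@//ora-host:1521/app_svc", "user", "pass");
             Statement st = ora.createStatement()) {
            st.execute("CREATE INDEX events_id_rev_ix ON events (id) REVERSE");
        }

        // Postgres: reverse() only works on text, so the closest you get is an
        // expression index on the reversed string, which a plain "WHERE id = ?"
        // lookup can't use anyway.
        try (Connection pg = DriverManager.getConnection(
                "jdbc:postgresql://pg-host:5432/app", "user", "pass");
             Statement st = pg.createStatement()) {
            st.execute("CREATE INDEX events_id_rev_ix ON events (reverse(id::text))");
        }
    }
}
```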
Oracle doesn't come without its drawbacks:
- Oracle is infamous for its licensing costs and audits. TDE, for example, is I believe a separate licensing cost, despite the fact that encryption at rest is required at most companies
- You have to include the cost of DataGuard in your TCO calculations, because there isn't built-in replication like Postgres provides for free to handle DR use cases
- You also have to include the cost of Oracle's point-in-time archiving/backups for restoration in case of failure in the TCO; the Postgres community has come up with a few free solutions for this. At this point, I wouldn't consider daily backups alone acceptable
- Applications typically work with grapheme clusters: a user neither knows nor cares how many bytes the emojis they filled an input field with take up, you still have to store them, and user interfaces will generally count 1 emoji as 1 character. In a lot of existing database installations, extended strings have not been turned on, so Oracle struggles with Unicode because of the 4000 BYTE limit on VARCHAR2 until that setting is enabled (see the first sketch after this list)
- Oracle largely ignores the SQL standard isolation levels (it only offers READ COMMITTED and SERIALIZABLE, and its SERIALIZABLE is really snapshot isolation), whereas Postgres adheres to them pretty well, ignoring the repeatable read interpretation differences that other RDBMSes have as well (see the second sketch below)
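To make the bytes-vs-characters point concrete, here's a small sketch using only the Java standard library (class and variable names are made up): a value a UI would count as 1,000 "characters" of emoji already hits 4000 bytes, the default VARCHAR2 ceiling when extended strings are off.

```java
import java.nio.charset.StandardCharsets;

public class EmojiBytes {
    public static void main(String[] args) {
        String emoji = "\uD83D\uDE00";      // 😀 (one grapheme, one code point, 4 bytes in UTF-8)
        String field = emoji.repeat(1000);  // what a UI counts as 1000 "characters"

        System.out.println(field.codePointCount(0, field.length()));       // 1000
        System.out.println(field.getBytes(StandardCharsets.UTF_8).length); // 4000
        // With the default MAX_STRING_SIZE=STANDARD, VARCHAR2 tops out at 4000 BYTES,
        // so this value barely fits and one more emoji would be rejected.
    }
}
```

And on isolation levels, a sketch of the difference as seen from the JDBC side (hosts and credentials are placeholders): I believe the Oracle driver only accepts READ COMMITTED and SERIALIZABLE, and its SERIALIZABLE behaves like snapshot isolation, while Postgres accepts all four standard levels.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class IsolationSketch {
    public static void main(String[] args) throws Exception {
        try (Connection ora = DriverManager.getConnection(
                "jdbc:oracle:thin:@//ora-host:1521/app_svc", "user", "pass")) {
            // Accepted, but behaves like snapshot isolation rather than true serializability.
            ora.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
            try {
                ora.setTransactionIsolation(Connection.TRANSACTION_REPEATABLE_READ);
            } catch (SQLException e) {
                System.out.println("Oracle driver rejects REPEATABLE READ: " + e.getMessage());
            }
        }

        try (Connection pg = DriverManager.getConnection(
                "jdbc:postgresql://pg-host:5432/app", "user", "pass")) {
            // Postgres maps all four standard levels (READ UNCOMMITTED runs as READ COMMITTED).
            pg.setTransactionIsolation(Connection.TRANSACTION_REPEATABLE_READ);
            pg.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
        }
    }
}
```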
> it can notice what queries are slow and add indexes for you, automatically
This can potentially cause performance differences between DEV/STG/PROD, where an index gets created in STG but then isn't there in PROD. Because of that possibility it would violate the change management process where I work, but it likely wouldn't be an issue for OpenAI.
> Database nodes scale horizontally and elastically. There is not such a thing as a primary master and read replicas in an Oracle cluster
This comes at the cost of possible cluster waits, and I've seen them last for 10-minute stretches; in my experience they're difficult to track down or optimize for.
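For what it's worth, this is roughly where I start looking when it happens. A sketch only: it assumes a RAC database, placeholder connection details, and enough privilege to query the GV$ views, and the exact columns vary a bit by version.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ClusterWaitSketch {
    public static void main(String[] args) throws Exception {
        // List sessions currently waiting on cluster ("gc ...") events, worst first.
        String sql = "SELECT inst_id, sid, event, seconds_in_wait "
                   + "FROM gv$session "
                   + "WHERE wait_class = 'Cluster' "
                   + "ORDER BY seconds_in_wait DESC";

        try (Connection c = DriverManager.getConnection(
                "jdbc:oracle:thin:@//ora-host:1521/app_svc", "user", "pass");
             Statement st = c.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            while (rs.next()) {
                System.out.printf("inst=%d sid=%d %s (%ds)%n",
                        rs.getInt(1), rs.getInt(2), rs.getString(3), rs.getInt(4));
            }
        }
    }
}
```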
> Database clusters are HA. Node lists are automatically distributed to client drivers. If a node fails or you do a rolling upgrade on the cluster
Everywhere I have worked does not use the full Oracle client/driver, which is what I believe is required for this. Using the full client requires organizational coordination that sometimes doesn't exist or won't exist, because not every team wants to use Oracle and DevOps doesn't want to support a multitude of different database drivers. I primarily work in Java and I've only ever seen the JDBC thin driver used, which as far as I know doesn't support this. You can create connection strings with a list of production servers that fail over to a list of servers in another region designated as your DR database (example below), but I don't believe those lists are dynamic.
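For reference, this is the static style of thin-driver URL I'm talking about (hosts, ports, and the service name are all made up): the production and DR address lists are baked into the string, so adding or removing a node means a config change rather than anything dynamic.

```java
import java.sql.Connection;
import java.sql.DriverManager;

public class FailoverUrlSketch {
    // Try the production ADDRESS_LIST first; if no node there answers,
    // fail over to the DR region's list.
    static final String URL =
        "jdbc:oracle:thin:@(DESCRIPTION_LIST=(FAILOVER=on)(LOAD_BALANCE=off)"
        + "(DESCRIPTION=(ADDRESS_LIST=(LOAD_BALANCE=on)"
        + "(ADDRESS=(PROTOCOL=tcp)(HOST=prod-node1)(PORT=1521))"
        + "(ADDRESS=(PROTOCOL=tcp)(HOST=prod-node2)(PORT=1521)))"
        + "(CONNECT_DATA=(SERVICE_NAME=app_svc)))"
        + "(DESCRIPTION=(ADDRESS_LIST=(LOAD_BALANCE=on)"
        + "(ADDRESS=(PROTOCOL=tcp)(HOST=dr-node1)(PORT=1521))"
        + "(ADDRESS=(PROTOCOL=tcp)(HOST=dr-node2)(PORT=1521)))"
        + "(CONNECT_DATA=(SERVICE_NAME=app_svc))))";

    public static void main(String[] args) throws Exception {
        try (Connection c = DriverManager.getConnection(URL, "user", "pass")) {
            System.out.println("connected as " + c.getMetaData().getUserName());
        }
    }
}
```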
> You can multiplex transactions onto sessions and do db-side connection pooling
This is HIGHLY dependent on the programming language/framework you're using. Spring Boot, for example, uses Hikari and generally avoids the driver's internal pool, which likely prevents this (minimal sketch of the usual setup below); I also believe the reactive drivers are required to do it in Java at all. Using it directly means a hard dependency on the specific database, and while it's rare to change database software, it can happen: my work is undergoing a switch from MSSQL to Oracle.
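To illustrate the "pool lives in the application" part, here's the usual Hikari setup (URL and credentials are placeholders): the pool holds plain JDBC connections on the client side, so driver-level transaction multiplexing / db-side pooling never comes into play unless you deliberately swap in Oracle's own pooling (UCP/DRCP) or the reactive stack.

```java
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

public class HikariSketch {
    public static void main(String[] args) throws Exception {
        HikariConfig cfg = new HikariConfig();
        cfg.setJdbcUrl("jdbc:oracle:thin:@//ora-host:1521/app_svc");
        cfg.setUsername("app_user");
        cfg.setPassword("secret");
        cfg.setMaximumPoolSize(20); // up to 20 sessions held open by the application

        // Hikari hands out plain JDBC connections it owns; the database just sees
        // up to 20 ordinary dedicated sessions, not multiplexed transactions.
        try (HikariDataSource ds = new HikariDataSource(cfg);
             Connection conn = ds.getConnection();
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT 1 FROM dual")) {
            rs.next();
            System.out.println(rs.getInt(1));
        }
    }
}
```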
I could be wrong on any of these; considering I have to use Oracle daily at work, I'm more than happy to be wrong and learn something, since I can apply it immediately.