▲ | joshsm5 a month ago | ||||||||||||||||||||||||||||
Exadata would definitely be an option and it has it's pros (I use it at work but that decision wasn't mine). Some of the features you forgot to mention are: - The built in bloom filters for reducing number of blocks scanned for FTS - Oracle (not specific to Exadata) supports creating reverse key indexes to help address high write loads causing contention on indexes, Postgres you can use a reverse function to reverse a string but not the actual byte representation so doesn't work on int/bigint types Oracle doesn't come without it's drawbacks: - Oracle is infamous for their licensing costs and audits, TDE for example I believe is a separate licensing cost despite that fact that for most companies encryption as rest is required - You have to include the cost of DataGuard into your calculations for TCO because there isn't replication built in like Postgres provides for free to handle DR use cases - You have to include the cost of Oracle's point in time archiving backups for restoration in case of failure for TCO, the Postgres community has come up with a few solutions for this that are free. At this point in time, I wouldn't consider only daily backups acceptable. - Applications typically work with grapheme clusters, a user doesn't need nor care how many bytes the emojis they filled an input field is, you need to store it and user interfaces are going to enforce 1 emoji as 1 character generally. In existing databases clusters, a lot of places have not turned on extended strings so Oracle strugles to handle Unicode due to its limitation on VARCHAR2 of 4000 BYTES until you turn this on. - Oracle completely ignores SQL standards isolation levels, whereas Postgres adheres to them pretty well (ignoring repeatable read interpretations which happens with other RDBMS as well) > it can notice what queries are slow and add indexes for you, automatically This potentially can cause performance differences in DEV/STG/PROD where an index gets created in STG but then isn't there in PROD. Because of this possibility, it would violate the change management processes where I work but likely wouldn't be an issue for OpenAI. > Database nodes scale horizontally and elastically. There is not such a thing as a primary master and read replicas in an Oracle cluster This comes at the cost of the possibility of cluster waits, and I've seen them happen for 10 minute time period and are difficult to track down or optimize for in my experience. > Database clusters are HA. Node lists are automatically distributed to client drivers. If a node fails or you do a rolling upgrade on the cluster Everywhere I have worked does not use the full Oracle driver which is what I believe is required for this. Using the full client requires organization coordination that sometimes doesn't exist or won't exist because not every team wants to use Oracle and DevOps doesn't want to support the multitude of ways to use different database drivers. I primarily work in Java and I've only ever seen the JDBC thin client used which as far as I know doesn't support this, you can create connection strings with a list of servers for production with failover to a list of servers in another region designated as your DR database, but I don't believe they're dynamic. > You can multiplex transactions onto sessions and do db-side connection pooling This is HIGHLY dependent on what programming language/framework you're using. Spring Boot for example uses Hikari and generally avoids using the internal driver pool which likely prevents this. I also believe using the reactive drivers is required to do this in Java in general. Using this directly means you have a hard dependency to the specific database. While it's rare to change database software it can happen, my work is undergoing a switch from MSSQL to Oracle. I could be wrong on any of these, considering I have to use Oracle daily at work I'm more than happy to be wrong and learn something since I can apply it immediately. | |||||||||||||||||||||||||||||
▲ | mike_hearn a month ago | parent [-] | ||||||||||||||||||||||||||||
Thanks for the insight! Some of these things were new to me :) I didn't mention audits and the like because OpenAI are in the cloud, and there are obviously no audits for managed databases. That's relevant for self-hosted setups. Both options I specced out included a replica, iirc. At least that's what Microsoft's HA option appears to be (it doubles the price, indeed). One of the problems in comparing with Postgres is people tend to advertise mutually exclusive options, sometimes in different comments by different people so it's not their fault. Anything that requires an extension might as well not exist for most Postgres users because they want to outsource DB management to the cloud, and there the cloud provider chooses the extensions not the user. Oracle doesn't have this problem because all the features are built-in and managed databases have access to everything. Postgres offers strictly serializable isolation indeed, but IIUC it's basically a form of read locking so will tank performance. Re: cluster waits. Multi-node clusters will inevitably have failure modes a single machine doesn't unfortunately, I'd prefer to always scale up first before scaling out. One of the advantages of a managed database is that this becomes the cloud provider's problem. Re: node list distribution. I think that's what the SCAN feature does, which is supported in the thin driver: https://docs.oracle.com/en/database/oracle/oracle-database/2... Yes, the advanced features are typically not exposed by generic middleware and you will have to change things to take advantage of them, but this is a general issue that affects all kinds of software. Features specific to the Linux kernel are often not exposed in Python or Java either, for instance. Some features are easily enabled by just dropping in UCP, which should be an easy upgrade from Hikari. Very new stuff like tx multiplexing requires more work by the app developer to exploit currently, but hopefully with time frameworks will catch up. Point is, the support is there if you need it. Compared to the gymnastics OpenAI are going through, it's easy stuff. | |||||||||||||||||||||||||||||
|