mike_hearn a month ago

Thanks for the insight! Some of these things were new to me :)

I didn't mention audits and the like because OpenAI are in the cloud, and there are obviously no audits for managed databases. That's relevant for self-hosted setups.

Both options I specced out included a replica, iirc. At least that's what Microsoft's HA option appears to be (it doubles the price, indeed).

One of the problems in comparing with Postgres is people tend to advertise mutually exclusive options, sometimes in different comments by different people so it's not their fault. Anything that requires an extension might as well not exist for most Postgres users because they want to outsource DB management to the cloud, and there the cloud provider chooses the extensions not the user. Oracle doesn't have this problem because all the features are built-in and managed databases have access to everything.

Postgres offers strictly serializable isolation indeed, but IIUC it's basically a form of read locking so will tank performance.

Re: cluster waits. Multi-node clusters will inevitably have failure modes a single machine doesn't, unfortunately, so I'd prefer to always scale up first before scaling out. One of the advantages of a managed database is that this becomes the cloud provider's problem.

Re: node list distribution. I think that's what the SCAN feature does, which is supported in the thin driver: https://docs.oracle.com/en/database/oracle/oracle-database/2...

Yes, the advanced features are typically not exposed by generic middleware and you will have to change things to take advantage of them, but this is a general issue that affects all kinds of software. Features specific to the Linux kernel are often not exposed in Python or Java either, for instance. Some features are easily enabled by just dropping in UCP, which should be an easy upgrade from Hikari. Very new stuff like tx multiplexing requires more work by the app developer to exploit currently, but hopefully with time frameworks will catch up. Point is, the support is there if you need it. Compared to the gymnastics OpenAI are going through, it's easy stuff.

anarazel a month ago

> Postgres offers strictly serializable isolation indeed, but IIUC it's basically a form of read locking so will tank performance.

Postgres' SSI [1] does not block reads. If you have lots of transactions reading and updating a lot of rows the granularity of tracking will become coarser to keep memory usage in bound though.

[1] https://drkp.net/papers/ssi-vldb12.pdf
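
To illustrate the coarsening trade-off described above, here is a toy sketch. It is not Postgres's implementation: the class, the method names, and the per-page threshold are all hypothetical. It only shows the general idea that collapsing many row-level entries into one page-level entry bounds memory, at the cost of reporting conflicts for rows that were never actually read.

```python
from collections import defaultdict


class ToyReadTracker:
    """Toy model of read tracking that coarsens under load.

    NOT Postgres's implementation; the threshold and names are hypothetical.
    """

    def __init__(self, max_rows_per_page=2):
        self.max_rows_per_page = max_rows_per_page
        self.rows = defaultdict(set)  # page -> individually tracked rows
        self.pages = set()            # pages tracked as a whole

    def record_read(self, page, row):
        if page in self.pages:
            return  # already covered by the coarser page-level entry
        self.rows[page].add(row)
        if len(self.rows[page]) > self.max_rows_per_page:
            # Too many row entries: replace them with a single page entry.
            del self.rows[page]
            self.pages.add(page)

    def conflicts_with_write(self, page, row):
        """True if a write to (page, row) touches something tracked as read."""
        return page in self.pages or row in self.rows.get(page, set())

    def entry_count(self):
        """Number of tracking entries currently held in memory."""
        return len(self.pages) + sum(len(r) for r in self.rows.values())
```

Once a page is tracked as a whole, a write to any row on it counts as a conflict, so in this toy model coarser tracking means fewer entries but more spurious aborts.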

mike_hearn a month ago

You're right, I mis-remembered - Postgres calls them "SIREAD locks" but they aren't actually locks. The model is based on aborting transactions.

The issue with this - and I'm not saying it's a bad feature because it's not and I'd like to have support for every isolation level everywhere - is that most apps can't tolerate transaction aborts. They need to be written to expect it and loop. Looping can easily turn into livelock if you aren't careful and there aren't great tools or known best practices for handling such a system.
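
A minimal sketch of the "expect the abort and loop" pattern, assuming the driver surfaces a serialization failure (SQLSTATE 40001 in Postgres) as an exception. `SerializationFailure`, `run_with_retry`, and `make_flaky_txn` are hypothetical names for illustration, not any real driver's API, and the fake transaction stands in for real database work. Bounding the attempts and randomizing the delay is one common way to reduce the livelock risk, not a guaranteed cure.

```python
import random
import time


class SerializationFailure(Exception):
    """Stand-in for a driver error carrying SQLSTATE 40001 (hypothetical name)."""


def run_with_retry(txn, max_attempts=5, base_delay=0.01, sleep=time.sleep):
    """Run txn() until it commits, retrying on serialization failures.

    Attempts are bounded and delays are randomized so that competing
    transactions do not keep retrying in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and surface the failure to the caller
            # Exponential backoff with full jitter.
            sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))


def make_flaky_txn(failures):
    """Build a fake transaction that aborts `failures` times, then succeeds."""
    state = {"calls": 0}

    def txn():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise SerializationFailure("could not serialize access")
        return state["calls"]  # number of attempts it took

    return txn
```

The whole transaction body has to live inside `txn` so that every retry re-reads its inputs; replaying only the failed statement would defeat the point of the isolation level.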

You also have to be careful because such transactions are only strictly serializable vs other transactions run at the same isolation level. So if you accidentally allow in a mix of transactions you are still exposed to isolation anomalies.

So it's a useful feature but not something most apps can easily port to.

jose_zap a month ago

> Postgres offers strictly serializable isolation indeed, but IIUC it's basically a form of read locking so will tank performance.

It will not. We use it at work for all transactions and it is very performant.

joshsm5 a month ago

Sorry, holiday, but I was referring to having a separate region for failover, which is called "Geo redundant backup" in Azure Database for Postgres, whereas with Exadata you need an entirely separate Exadata rack in a different region with the licensing of Data Guard on top of it.

Most cloud providers support a lot of extensions. For example, auto range partitioning isn't built into Postgres the way it is in Oracle, but it's supported with an extension. Azure Database supported more extensions than AWS last I checked. https://learn.microsoft.com/en-us/azure/postgresql/extension...

I will have to check with our DBAs if we use the scan feature or not, appreciate you pointing it out.