bob1029 5 days ago

Adapting the plan at runtime seems like the most universal solution for optimizer edge cases and is already implemented in the big 3.

If you think about it, the adaptive mechanism doesn't have to be perfect to deliver a lot of uplift. Even coarse-grained detection that restarts the query from zero can make a huge difference if the alternative is a query burning up the server for hours and a failed batch job.

qazxcvbnm 2 days ago | parent | next

Where can I find more information on the state of the art in runtime adaptation? I confess the databases I regularly use don't seem to offer anything of the sort.

Adaptation sounds very compelling: instead of emitting a plan based only on a cardinality estimate, we could emit a plan together with a range of reasonable intermediate cardinalities and expected times, interrupt the original plan when the expectations are exceeded by an order of magnitude, and switch to an alternative plan based on the newly gathered physical information. Are there concrete reasons this has not been done (e.g. cost, complexity)?
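Something like this toy loop is what I have in mind; every name in it is hypothetical, sketched for illustration rather than taken from any real engine:

    # Hypothetical adaptive executor, Python for illustration only.
    # steps: list of (name, produce_rows, (lo_est, hi_est)), where
    # produce_rows is a callable returning the operator's output rows
    # and (lo_est, hi_est) is the planner's expected cardinality range.
    # replan: callback taking the observed counts and the remaining
    # steps, returning a replacement pipeline built on real numbers.
    def run_adaptively(steps, replan):
        observed = {}
        remaining = list(steps)
        while remaining:
            name, produce_rows, (lo_est, hi_est) = remaining.pop(0)
            rows = produce_rows()
            observed[name] = len(rows)
            # Interrupt when reality exceeds the estimate by an order
            # of magnitude (only the upper bound is checked in this
            # toy); re-plan from the gathered physical information.
            if observed[name] > 10 * hi_est:
                remaining = replan(observed, remaining)
        return observed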

mike_hearn 2 days ago | parent | next

It has been done. It's just that open source databases lag behind the state of the art.

https://docs.oracle.com/en/database/oracle/oracle-database/2...

Adaptive planning also covers adapting the degree of query parallelism and automatically banning query plans that take too much time.

Commercial databases have much bigger teams than Postgres does and, as a consequence, a lot more optimizations. This includes optimizations that aren't in scope for an open source DB to begin with. For example, Oracle has used the Apple strategy for decades: a lot of Oracle DBs are sold as hardware, not software, shipped to the customer as a full cluster rack or set of racks. The gear uses dedicated inter-node 100Gb/sec Ethernet within the cluster to give horizontal scalability for joins and other complex read/write loads, while queries and results go in and out over a separate frontend network. There are customizations up and down the stack: kernel, firmware, device drivers, database nodes, and the client drivers. Nodes directly read/write each other's memory using RDMA to fetch the latest version of rows from each other or to inform each other about new transactions.

https://static.rainfocus.com/oracle/ocw24/sess/1716494555804...

You could do stuff like this with Postgres perhaps, if you fork and rewrite enough of it, but it's unlikely to ever be done upstream.

Disclosure: I have a part-time role at Oracle. All opinions 100% guaranteed unofficial.

namibj 2 days ago | parent | prev

Both reasons together :D

But yeah, you can snatch traversal statistics from your index accesses as you're executing. If those hit planner-provided thresholds, you go do whatever the planner said to do or check for that specific trigger, then feed your statistics and the extra queried info back to the planner to get your plan adjusted.

Doing this without creating unpredictable performance bugs, let alone correctness bugs, is going to be hard, though. Even more so when you consider that it also has to be fast, because otherwise it wouldn't be worth using.
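A minimal sketch of such a trigger, with all names made up for illustration: the scan itself counts what it traverses and bails out to the planner once it crosses the planner-provided threshold.

    # Toy illustration; no real engine exposes this interface.
    class ThresholdExceeded(Exception):
        def __init__(self, rows_seen):
            # Carry the traversal statistic back to the planner.
            self.rows_seen = rows_seen

    def guarded_scan(index_iter, threshold):
        # Wrap any index scan; stop once traversal passes the
        # planner-provided threshold so the caller can re-plan.
        count = 0
        for row in index_iter:
            count += 1
            if count > threshold:
                raise ThresholdExceeded(count)
            yield row

The caller catches the exception, feeds the gathered count back to the planner, and resumes under the adjusted plan; resuming without duplicating or dropping rows is exactly where those correctness bugs lurk.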

You may want to check out WCOJ (worst-case optimal join) technology, where you can run a multi-way join with cardinalities of a "could well be >10, don't know, and it'll vary across data/keys" nature:

1. You index all the data as needed for perfect index-lookup joins under ANY non-stupid join ordering the query could be run with.

2. You prepare separate indices that tell you, for each key, the number of values a single-key lookup join would return, for each such possible index-lookup join included in (1).

3. To query, you start with a key or key range that can be fed to an index from (1). Then, for each tuple at each step between individual joins, you ask the candidate indices of (2) which one would be cheapest to expand with, use the selected index of (1) (as told by the fine-grained cardinality counts of (2)) to expand your tuple, and then choose the next relation to expand with.

If you ever have a choice where you know the cardinality of a key in (1) is always 0 or 1, you don't need an index of (2) for that index of (1), and you always greedily apply those, preferring the ones with a higher chance of killing the currently evaluated partial tuple before it propagates through further indices.

That alone, btw., is sufficient to get the worst-case optimal runtime for triangle listing/counting in arbitrary edge lists: O(n^(3/2)) (incidentally equal to the maximum possible number of triangles) instead of the naive/classic O(n^2), where n is the number of edges.
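Here's a minimal sketch of that triangle case in Python. It's the standard degree-ordered formulation rather than the full index machinery of (1)/(2), but the set intersection plays exactly the role of the cardinality-guided index lookup:

    from collections import defaultdict

    def triangles(edges):
        # Count degrees.
        deg = defaultdict(int)
        for u, v in edges:
            deg[u] += 1
            deg[v] += 1
        # Orient each edge toward its higher-(degree, id) endpoint;
        # every out-neighbor set then has size O(sqrt(n)).
        out = defaultdict(set)
        for u, v in edges:
            a, b = sorted((u, v), key=lambda x: (deg[x], x))
            out[a].add(b)
        # Expand each directed edge through a neighbor-set
        # intersection; each triangle is produced exactly once.
        for u in out:
            for v in out[u]:
                for w in out[u] & out.get(v, set()):
                    yield (u, v, w)

Each intersection touches O(sqrt(n)) entries and there are n directed edges, which gives the O(n^(3/2)) total.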

Constants here for the triangle case are on the order of 10~50, depending on how hard you want to argue for vectorizing and using flat-array "columnar" data structures instead of in-memory structures that allow incremental updating without brute-force rebuilding from scratch on every change. E.g. supporting adding (but not deleting) edges while efficiently getting updated results is something that doesn't fit the most aggressive performance optimizations, which is what gives the ~50 constant factor vs. a "mere" ~10 when you want to support incremental updates.

(At constant=10, break-even would be at n=100; at constant=50, break-even would be at n=2500. IMO it's therefore quite relevant in many practical situations, given how quickly those asymptotic benefits kick in.)
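Spelled out, break-even is where c*n^(3/2) catches up with n^2, i.e. n = c^2; a quick sanity check:

    # Break-even of c * n**1.5 vs. n**2 lands at n = c**2.
    for c in (10, 50):
        n = c * c
        assert c * n**1.5 == n**2
        print(f"constant {c}: break-even at n = {n}")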

RaftPeople 2 days ago | parent | prev

> Adapting the plan at runtime seems like the most universal solution for optimizer edge cases and is already implemented in the big 3

But the issues frequently aren't edge cases, and frequently they do arise at runtime (i.e. a new query); it's just the best the optimizer can do with limited info (cardinality) and a limited duration for evaluating alternatives.

EDIT:

I just realized I misunderstood the post I responded to. I thought "adapt" meant updating a previously stored query plan, but I believe the intended meaning is adapting the plan during execution of the query.