HDThoreaun 3 days ago

Yeah, this is pretty fucking basic stuff. Any competent optimization engine should be doing this. "Push down predicates as much as possible" is literally the first thing a query planner should be trying to do.

ncruces 3 days ago | parent | next

Yes.

But here they are deciding between pushing down "o.status==shipped" and "u.email==address@" on both sides in parallel and then joining (which is what they already did), versus first doing "u.email==address@" and then pushing down "u.id==o.user_id".

This is a judgment call. Their planner is pretty dumb not to know which one is better, but "push down as much as possible" doesn't cut it: you need to actually decide what to push down, and why.
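Roughly, the trade-off looks like this (a toy Python sketch; the row counts are invented and the table names are just inferred from the predicates above):

    # Toy comparison of the two strategies, with invented cardinalities;
    # the table/column names are inferred from the predicates quoted above.
    N_ORDERS = 10_000_000
    N_USERS = 1_000_000
    SHIPPED_FRACTION = 0.40   # assume o.status=='shipped' matches 40% of orders
    EMAIL_MATCHES = 1         # assume u.email is unique

    # Plan A: push both filters down independently, then join the two results.
    plan_a_rows = N_ORDERS * SHIPPED_FRACTION + EMAIL_MATCHES

    # Plan B: look up the user by email first, then push u.id==o.user_id into
    # the orders index and filter status on the handful of rows that remain.
    plan_b_rows = EMAIL_MATCHES + N_ORDERS / N_USERS

    print(f"plan A touches ~{plan_a_rows:,.0f} rows")   # ~4,000,001
    print(f"plan B touches ~{plan_b_rows:,.0f} rows")   # ~11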

HDThoreaun 3 days ago | parent

No, it is not a judgment call. The query planner should be storing the distributions of the values in every index, which makes it obvious which pushdown to do here. Again, basic stuff. You're right, though, that it's not quite as simple as "push down as much as possible"; it's one step past that.
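A minimal sketch of what storing those distributions buys you, with invented stats:

    # Minimal sketch: with per-column value distributions on hand, the
    # pushdown choice is mechanical. All numbers below are invented.
    column_stats = {
        ("orders", "status"): {"n_rows": 10_000_000, "n_distinct": 5,
                               "most_common": {"shipped": 0.40}},
        ("users", "email"):   {"n_rows": 1_000_000, "n_distinct": 1_000_000,
                               "most_common": {}},
    }

    def estimated_rows(table, column, value):
        s = column_stats[(table, column)]
        # Use the most-common-values list if the literal is in it, otherwise
        # assume a uniform spread over the distinct values.
        frac = s["most_common"].get(value, 1.0 / s["n_distinct"])
        return s["n_rows"] * frac

    candidates = {
        "o.status==shipped": estimated_rows("orders", "status", "shipped"),
        "u.email==address@": estimated_rows("users", "email", "address@"),
    }
    # Drive the plan from the most selective predicate (fewest estimated rows).
    print(min(candidates, key=candidates.get))   # u.email==address@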

Jupe 3 days ago | parent | next

Agreed. Isn't this precisely why key statistics (table statistics) are maintained in many DB systems? Essentially, always "push down" the predicate with the worst statistics and always execute (early) the predicates with high selectivity.

I'd be very surprised if virtually every RDBMS didn't do this already.

jmalicki 3 days ago | parent | prev

Without storing the joint distribution of the values corresponding to the conditions that span multiple tables, it can be hard to know what's a win.
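A toy illustration with invented data: the two predicates below are perfectly correlated across the join, so multiplying per-table selectivities is off by 5x.

    # Toy data where the user-side and order-side predicates are perfectly
    # correlated across the join; an optimizer assuming independence
    # multiplies the selectivities and badly underestimates. Numbers invented.
    users  = [{"id": i, "region": "EU" if i % 5 == 0 else "US"} for i in range(1000)]
    orders = [{"user_id": i, "currency": "EUR" if i % 5 == 0 else "USD"} for i in range(1000)]

    sel_region   = sum(u["region"] == "EU" for u in users) / len(users)        # 0.2
    sel_currency = sum(o["currency"] == "EUR" for o in orders) / len(orders)   # 0.2
    independence_estimate = sel_region * sel_currency                          # 0.04

    joined = [(u, o) for u in users for o in orders if u["id"] == o["user_id"]]
    actual = sum(u["region"] == "EU" and o["currency"] == "EUR" for u, o in joined) / len(joined)

    print(independence_estimate, actual)   # 0.04 vs 0.2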

Seattle3503 3 days ago | parent

Postgres by default computes univariate stats for each column and uses those. If this is producing bad query plans, you can manually extend the statistics to be multivariate for select groups of columns. But to avoid combinatorial growth of stats-related storage and work, you have to pick the columns by hand.
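For example (placeholder table/column/DSN names; the statistics kinds are the standard Postgres extended-statistics ones):

    # Placeholder table/column/DSN names; CREATE STATISTICS is the standard
    # Postgres (10+) extended-statistics feature.
    import psycopg2

    conn = psycopg2.connect("dbname=app")  # placeholder DSN
    with conn, conn.cursor() as cur:
        cur.execute("ANALYZE orders;")  # default: univariate stats per column
        # Opt in to multivariate stats for one hand-picked column group;
        # Postgres will not enumerate column combinations for you.
        cur.execute("""
            CREATE STATISTICS IF NOT EXISTS orders_user_status_stats
                (dependencies, ndistinct) ON user_id, status FROM orders;
        """)
        cur.execute("ANALYZE orders;")  # populate the new extended statistics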

jmalicki 2 days ago | parent

That still isn't multivariate stats across a join.

zmmmmm 3 days ago | parent | prev

I had to dig through to see what database was really in play here, and sure enough, it's a wrapper around a key-value store (RocksDB). While I'll confess I know little about RocksDB, it sounds an awful lot like they threw out a mature relational database engine with built-in optimization, and are now paying the price by manually optimizing each query (against a key-value store no less, which probably fundamentally limits what optimizations can be done in any general way).

Would be curious if any RocksDB-knowledgeable people have a different analysis.

hxtk 3 days ago | parent | next

> against a key-value store no less, which probably fundamentally limits what optimizations can be done in any general way

I would disagree with this assumption for two reasons. First, theoretically: a file system is a key-value store, and basically all databases run on file systems, so it stands to reason that any optimization Postgres does can be achieved as an abstraction over a key-value store with a good API, because Postgres already did exactly that.

Second, less theoretically: this has already been done by CockroachDB, which currently stores its data in Pebble (CRDB's Go rewrite of RocksDB) and previously used RocksDB itself, and by TiDB, which stores its data in TiKV.

A thin wrapper over a KV store will only be able to use the optimizations the KV store provides, but if your wrapper is thick enough to include abstractions like multiple tables, or atomically inserting values into multiple cells across multiple tables, then you can build arbitrary indices into the abstraction.

I wouldn’t tend to call a KV store a bad database engine because I don’t think of it as a database engine at all. It might technically be one under the academic definition of a database engine, but I mostly see it being used as a building block in a more complicated database engine.
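A toy sketch of the "thick wrapper" idea, with a plain dict standing in for RocksDB/Pebble and a made-up key encoding:

    # The KV "store" here is just a dict standing in for RocksDB/Pebble,
    # and the key encoding is invented for illustration.
    kv = {}

    def put_user(user_id, email, name):
        # A real engine would put both writes in one atomic write batch.
        kv[("users", user_id)] = {"email": email, "name": name}    # primary row
        kv[("users_by_email", email, user_id)] = user_id           # secondary index entry

    def find_user_by_email(email):
        # A real KV store would range-scan the "users_by_email" prefix;
        # the dict forces a linear scan here.
        for key, user_id in kv.items():
            if key[0] == "users_by_email" and key[1] == email:
                return kv[("users", user_id)]
        return None

    put_user(1, "ada@example.com", "Ada")
    print(find_user_by_email("ada@example.com"))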

marceloaltmann a day ago | parent | prev

Readyset is an incremental view maintenance cache, powered by a dataflow graph, that keeps cached result sets up to date as the underlying data changes in the upstream database (MySQL/PostgreSQL). RocksDB is only used as the persistent storage here; the optimization discussed is about dataflow graph (DFG) execution and is not related to the persistent storage itself.
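A hand-wavy sketch of the incremental view maintenance idea, with a made-up query shape and node structure (not Readyset's actual dataflow):

    # Deltas carry a sign (+1 insert, -1 delete) and patch the cached result
    # set without re-running the query against the base tables.
    from collections import defaultdict

    # Cached result of: SELECT user_id, count(*) FROM orders
    #                   WHERE status = 'shipped' GROUP BY user_id
    cached_counts = defaultdict(int)

    def on_orders_delta(row, sign):
        if row["status"] != "shipped":          # filter node
            return
        cached_counts[row["user_id"]] += sign   # grouped-count node

    on_orders_delta({"user_id": 7, "status": "shipped"}, +1)
    on_orders_delta({"user_id": 7, "status": "pending"}, +1)
    print(dict(cached_counts))   # {7: 1}, without touching the base tables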