gvsg-rs a day ago
Hi, I’m from Readyset. We hadn’t realized this post had picked up traction here, but I wanted to share a bit more context.

Some folks pointed out that index pushdowns and join optimizations aren’t novel. That’s fair. In a traditional database engine, pushdowns and access path selection are standard. But Readyset isn’t a conventional engine.

When you create a materialized view in Readyset, the query is compiled into a dataflow graph designed for high-throughput incremental updates, not per-request planning. We receive changes from the upstream database’s replication stream and propagate deltas through the graph. Reads typically hit the cache directly at sub-ms latency. But when a key hasn’t yet been materialized, we perform what we call an upquery: a one-off pull from the base tables (stored in RocksDB) to hydrate the missing result.

Since we don’t re-plan queries on each request, the structure of that upquery, including filter pushdowns and join execution, is precompiled into the dataflow. Straddled joins, where filtering is required on both sides of the join, are especially tricky in this model. Without smarter pushdown, we were overfetching data and doing unnecessary join work. This optimization pushes composite filters into both sides of the join to reduce RocksDB scans and hash table size.

It’s a well-known idea in the context of traditional databases, but making it work in a static, incrementally maintained dataflow system is what makes it unique here. Happy to go deeper if folks are curious. Appreciate the thoughtful feedback.
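For anyone who wants the straddled-join point in concrete terms, here is a toy sketch in Python. It is not Readyset code: the tables, column names, and helper functions are all made up for illustration, and plain lists stand in for the RocksDB-backed base tables. It only shows why filtering both sides before the join shrinks the hash table while returning the same rows.

```python
from collections import defaultdict

# Hypothetical base tables; names and columns are illustrative only.
users = [
    {"id": 1, "country": "US"},
    {"id": 2, "country": "BR"},
    {"id": 3, "country": "US"},
]
orders = [
    {"id": 10, "user_id": 1, "status": "open"},
    {"id": 11, "user_id": 1, "status": "closed"},
    {"id": 12, "user_id": 2, "status": "open"},
    {"id": 13, "user_id": 3, "status": "open"},
]


def hash_join(left, right, left_key, right_key):
    """Build a hash table on `left`, then probe it with `right`.

    Returns (joined row pairs, number of rows held in the hash table).
    """
    table = defaultdict(list)
    for row in left:
        table[row[left_key]].append(row)
    built = sum(len(rows) for rows in table.values())
    joined = [(l, r) for r in right for l in table.get(r[right_key], [])]
    return joined, built


def join_then_filter(country, status):
    """No pushdown: join the full tables, then apply both filters."""
    joined, built = hash_join(users, orders, "id", "user_id")
    rows = [
        (u, o)
        for u, o in joined
        if u["country"] == country and o["status"] == status
    ]
    return rows, built


def filter_then_join(country, status):
    """Pushdown: apply each side's filter first, then join what is left."""
    left = [u for u in users if u["country"] == country]
    right = [o for o in orders if o["status"] == status]
    return hash_join(left, right, "id", "user_id")


naive_rows, naive_built = join_then_filter("US", "open")
pushed_rows, pushed_built = filter_then_join("US", "open")
```

Both functions return the same joined rows, but the pushdown version builds its hash table from only the rows that pass the left-side filter and probes it with only the rows that pass the right-side filter. In a real engine the per-side filters would be index lookups rather than list scans, which this sketch does not model.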