falconroar 3 hours ago
Interesting, I wasn't aware; thanks for that. I will say, Polars' implementation is much more centered on out-of-core processing, and it bypasses some of DuckDB's limitations ("DuckDB cannot yet offload some complex intermediate aggregate states to disk"). Both are incredible pieces of software. To expand on this, Polars' `LazyFrame` implementation makes it simple to add new backends such as GPU, streaming, and now distributed computing (though the latter is currently locked to a vendor). The DuckDB codebase just doesn't have this flexibility, though there are ways to get it to run on GPU using external software.
noworriesnate 35 minutes ago
Have you seen Ibis[1]? It's a dataframe API that translates calls into queries for various backends, including Polars and DuckDB. I've messed around with it a little for cases where the data engineering transforms had to use PySpark but I wanted to do exploratory analysis in an environment that didn't have PySpark.