| ▲ | vouwfietsman 3 hours ago | |
> DuckDB wouldn't really know what to do with a

Sure it would: you can attach a multi-table SQLite database in DuckDB.

> that does not mean just because it came first

I agree with most of your points. I am not stating my opinion, just my observations. I am the target audience here: I want to use this, but I don't care about the file format itself nearly as much as I care about the data inside. That means access, which means compatibility with my tooling. Compatibility is hard to beat. This is the Concorde of file formats.
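For anyone who wants to try the SQLite attach mentioned above, here is a minimal sketch, assuming the third-party `duckdb` Python package is installed and that DuckDB can download its `sqlite` extension. The table names, sample rows, the `src` alias, and both helper functions are made up for illustration.

```python
import os
import sqlite3
import tempfile


def make_sqlite_db(path):
    """Create a small multi-table SQLite file (hypothetical sample data)."""
    con = sqlite3.connect(path)
    with con:  # commits on success
        con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
        con.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)"
        )
        con.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ann"), (2, "bob")])
        con.executemany(
            "INSERT INTO orders VALUES (?, ?, ?)",
            [(1, 1, 9.5), (2, 1, 3.0), (3, 2, 7.25)],
        )
    con.close()


def query_with_duckdb(path):
    """Attach the SQLite file in DuckDB and join across its tables."""
    import duckdb  # third-party dependency, assumed installed (pip install duckdb)

    con = duckdb.connect()  # in-memory DuckDB database
    con.execute("INSTALL sqlite")  # fetches the extension on first use
    con.execute("LOAD sqlite")
    escaped = path.replace("'", "''")  # escape quotes for the SQL string literal
    con.execute(f"ATTACH '{escaped}' AS src (TYPE sqlite)")
    # Every table in the SQLite file is visible under the 'src' alias.
    return con.execute(
        "SELECT u.name, sum(o.total) AS total "
        "FROM src.users u JOIN src.orders o ON o.user_id = u.id "
        "GROUP BY u.name ORDER BY u.name"
    ).fetchall()


def demo_path():
    """Return a fresh temporary path for the sample database."""
    return os.path.join(tempfile.mkdtemp(), "sample.db")
```

Calling `make_sqlite_db(p)` followed by `query_with_duckdb(p)` on a path from `demo_path()` should return one row per user with their order total. The point is that a single `ATTACH` exposes all tables in the file at once.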
| ▲ | aduffy 3 hours ago | parent [-] | |
That is fair. FWIW, I think if you are doing pure analytics and nothing else, Parquet will probably continue to do the job just fine, and you don't need to touch your workloads at all. I think these new formats will find a niche where people aren't just running Spark jobs but are doing lots of systems building over large tables. If you're building a PB-scale data warehouse, you care a lot about the file format because it is a big factor in your performance curve, and you're willing to ship new experimental codecs to support new datatypes the system wasn't originally designed for, or to use a newly invented compressor.