pattar 12 hours ago

I went to a talk by the MotherDuck team about why they built DuckLake instead of leaning further into Iceberg. The key takeaway is that instead of storing all the table metadata in files on S3 and dealing with the latency and file I/O, they store all of that info in a DuckDB table. Seems like a good idea, and it worked smoothly when I tried it. However, it is not quite in a stable production state yet; it is still <1.0. They have a nice talk about it on YouTube: https://youtu.be/hrTjvvwhHEQ?si=WaT-rclQHBxnc9qV
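
To make the idea concrete, here is a rough sketch of what "metadata in a database table" buys you. It is only an illustration: the `data_files` table, its columns, the S3 paths, and the `files_to_scan` helper are all made up for this example and are not DuckLake's actual schema or API, and I'm using Python's built-in `sqlite3` as a stand-in for the catalog database. The point is that finding which Parquet files a query needs becomes one indexed SQL lookup rather than a chain of manifest reads from object storage.

```python
import sqlite3

# Hypothetical catalog schema for illustration only; NOT DuckLake's real tables.
# sqlite3 stands in for whatever SQL database holds the metadata.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE data_files (
        table_name TEXT,
        path       TEXT,
        min_ts     INTEGER,  -- smallest timestamp stored in the file
        max_ts     INTEGER,  -- largest timestamp stored in the file
        row_count  INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO data_files VALUES (?, ?, ?, ?, ?)",
    [
        ("events", "s3://bucket/events/part-0.parquet", 0, 99, 1000),
        ("events", "s3://bucket/events/part-1.parquet", 100, 199, 1000),
        ("events", "s3://bucket/events/part-2.parquet", 200, 299, 1000),
    ],
)


def files_to_scan(conn, table_name, lo, hi):
    """Return paths of files whose [min_ts, max_ts] range overlaps [lo, hi]."""
    rows = conn.execute(
        "SELECT path FROM data_files "
        "WHERE table_name = ? AND max_ts >= ? AND min_ts <= ? "
        "ORDER BY path",
        (table_name, lo, hi),
    )
    return [path for (path,) in rows]


# Planning a query for timestamps 150..250 skips part-0 entirely.
print(files_to_scan(conn, "events", 150, 250))
```

With file-based metadata, the same pruning step would typically mean fetching and parsing several small files over the network before any data is read.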

willvarfar 9 hours ago

(I work a lot with BigQuery's BigLake adaptor, and it basically caches the metadata from the Iceberg manifest and the Parquet footers in Bigtable (this is Google), so query planning is super fast etc. Really helps.)