mrbungie 7 months ago
I was almost going to build a lakehouse* with DuckDB because I low-key love it: it's the easiest and strongest analytical engine I've found yet. It scales from laptops to big metal, runs mostly out-of-core as long as you do sane stuff, and avoids distributed computing for SQL in the process (looking at you, Spark).

That is, until I found out it does not support Iceberg writes [1]. That's a big no-no, as I would need another engine for inserts, and I want a simple stack :(. What a bummer.

[1] https://github.com/duckdb/duckdb_iceberg/issues/37

*That is what they are called now, aren't they? I just can't follow the terms anymore haha.
nicornk 7 months ago
Fivetran tried to upstream write support, but it was not accepted: https://github.com/duckdb/duckdb-iceberg/pull/95
jeadie 7 months ago
This is one of the ideas behind using DuckDB in github.com/spiceai/spiceai
mritchie712 7 months ago
It's coming. They already have Hive-style Parquet writes. Iceberg is more complicated than that, but it's certainly doable.
buremba 7 months ago
It's not just for building a new one; it can also complement existing data warehouses/lakehouses: https://github.com/buremba/universql

The flight extension is excellent, as it removes the need to write C++ extensions and lets you use your favorite language to develop native DuckDB catalogs. It's straightforward to build data lake connectors and plug them in as a flight catalog, thanks to Airport!
benrutter 7 months ago
I'm curious, did you consider Delta tables? Pretty sure DuckDB supports them nicely. If you did, how come you chose not to go with them?
sukhavati 7 months ago
Same here, man. I ended up going with Trino explicitly for writing and data management, and using chdb/DuckDB to process data for front-ends etc. (mostly Ethereum data, so chdb "support" for UInt256 is quite important).