srameshc 5 days ago
I love SQLite, and I'm in no way trying to devalue it; the author's method is an excellent approach to getting analytical speed out of SQLite. But I've been loving DuckDB for similar analytical workloads, since it's built for exactly those tasks. Like SQLite, DuckDB reads from a single file, and it processes large data sets at extreme speed. I work on my MacBook M2 and have been dealing with about 20 million records, and it's fast, very fast. Loading data into DuckDB is super easy, which surprised me:

    SELECT avg(sale_price), count(DISTINCT customer_id)
    FROM '/my-data-lake/sales/2024/*.json';

You can also load data into a JSON-typed column and use Postgres-style syntax like col->>'$.key'.
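To make the JSON-column part concrete, here's a rough sketch; the table and key names (events, payload, user_id, amount) are made-up examples, and it assumes DuckDB's bundled json extension is available:

    -- hypothetical table with a JSON column
    CREATE TABLE events (id INTEGER, payload JSON);
    INSERT INTO events VALUES
        (1, '{"user_id": "a1", "amount": 19.99}'),
        (2, '{"user_id": "b2", "amount": 5.00}');

    -- Postgres-style extraction: ->> pulls the value out as text
    SELECT payload->>'$.user_id' AS user_id,
           CAST(payload->>'$.amount' AS DOUBLE) AS amount
    FROM events;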
loa_observer 5 days ago
DuckDB is super fast for analytic tasks, especially when you use it with a visual EDA tool like pygwalker; it lets you handle visuals and EDA over millions of rows in seconds. But I'd say comparing DuckDB and SQLite is a little unfair. I would still use SQLite to build systems in most cases and reserve DuckDB for analytics: it's hard to get a smooth deployment across many platforms if your app bundles DuckDB.
| ||||||||
mikepurvis 5 days ago
Whoa. Is that first query building an index of random filesystem JSON files on the fly?
| ||||||||
lame_lexem 5 days ago
Can we all agree to never store datasets uncompressed? DuckDB supports reading many compression formats.
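A quick sketch of what that can look like (the paths below are made up, apart from the one from the top comment): DuckDB infers gzip from the file extension, and Parquet output is compressed by default:

    -- query gzipped JSON/CSV directly; compression is detected from the .gz extension
    SELECT count(*) FROM read_json_auto('/my-data-lake/sales/2024/*.json.gz');
    SELECT count(*) FROM read_csv_auto('/data/events.csv.gz');

    -- write a compressed Parquet copy instead of keeping raw JSON around
    COPY (SELECT * FROM read_json_auto('/my-data-lake/sales/2024/*.json.gz'))
        TO '/data/sales_2024.parquet' (FORMAT parquet, COMPRESSION zstd);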
| ||||||||