HowardStark 4 hours ago

Been playing around with ClickHouse a lot recently and have had a great experience, particularly because it hits many of these same points. In my case "local files" hasn't been a big part of my workflow, but the Parquet and JSON ingestion have been very convenient, and I think CH intends `clickhouse-local` to be some sort of analog to the "add duckdb" point.

One of my favorite features is `SELECT ... FROM s3Cluster('<ch cluster>', 'https://...<s3 url>.../data/*/*.json', ..., 'JSON')`[0] which lets you wildcard ingest from an S3 bucket and distributes the processing across nodes in your configured cluster. Also, I think it works with `schema_inference_mode` (mentioned below) though I haven't tried it. Very cool time for databases / DB tooling.
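
For anyone who wants to picture it, a rough sketch of that kind of query might look like the following. The cluster name, bucket URL, and glob pattern are made-up placeholders, the credential arguments are left out, and I've assumed newline-delimited JSON (`JSONEachRow`) rather than the `JSON` format named above. The `schema_inference_mode` line is commented out because, as noted, that combination is untried.

```sql
-- Sketch only: 'my_cluster' and the bucket URL are hypothetical placeholders.
-- Credential arguments are omitted; add them (or NOSIGN for a public bucket)
-- as your setup requires.
SELECT count()
FROM s3Cluster(
    'my_cluster',                                        -- cluster name from your server config
    'https://my-bucket.s3.amazonaws.com/data/*/*.json',  -- '*' globs match many objects
    'JSONEachRow'                                        -- assumed format: one JSON object per line
)
-- Untested with s3Cluster: merge the schemas inferred from all matched files.
-- SETTINGS schema_inference_mode = 'union'
;
```

Swapping `count()` for an `INSERT INTO ... SELECT *` would turn the same shape into a bulk load, with each node in the cluster reading a share of the matched objects.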

(I actually wasn't familiar with `union_by_name`, but it looks like ClickHouse has implemented that as well [1,2]. Neat feature in either case!)

[0] https://clickhouse.com/docs/sql-reference/table-functions/s3...
[1] https://clickhouse.com/docs/interfaces/schema-inference
[2] https://github.com/ClickHouse/ClickHouse/pull/55892