| ▲ | joshstrange 6 months ago | |
And that’s exactly the limitation I’m talking about. If that works for you, ClickHouse is amazing. For things like logs I can 100% see the value. But data that is ETL’d and might need updates later? That sucks.
| ▲ | atemerev 6 months ago | parent | next [-] | |
If you can afford infrequent, batched updates, it sucks much less. But yes, if your data is highly mutable, or you cannot do batch writes, then ClickHouse is the wrong choice. Otherwise... it is _really_ hard to ignore a 50x (or more) speedup. Logs, events, metrics, rarely updated things like phone numbers or geocoding, archives, embeddings... Whoooop, it slurps the entire Reddit dataset in 48 seconds. Straight from S3. Magic. If you still want really fast analytics but have more complex scenarios and/or data-loading practices, there's also Kinetica... if you can afford the price. For tiny datasets (a few terabytes), DuckDB might be a great choice too. But Postgres is usually the wrong thing to try to make work here.
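For anyone curious what "straight from S3" looks like in practice, here is a rough sketch using the Python clickhouse-connect client and ClickHouse's s3() table function. The bucket path, table name, and schema are all made up for illustration; a private bucket would also need credentials (or NOSIGN for anonymous access).

```python
# Sketch: bulk-load Parquet files from S3 into ClickHouse in a single pass.
# Host, table name, columns, and bucket path are hypothetical.
import clickhouse_connect

client = clickhouse_connect.get_client(host="localhost", username="default", password="")

# Append-only MergeTree table, suited to large batched inserts.
client.command("""
    CREATE TABLE IF NOT EXISTS reddit_comments (
        id String,
        subreddit LowCardinality(String),
        author String,
        created_utc DateTime,
        body String
    )
    ENGINE = MergeTree
    ORDER BY (subreddit, created_utc)
""")

# One INSERT ... SELECT pulls every matching object directly from S3;
# ClickHouse parallelizes the download and parsing internally.
# (A private bucket would need access keys as extra s3() arguments.)
client.command("""
    INSERT INTO reddit_comments
    SELECT id, subreddit, author, created_utc, body
    FROM s3('https://my-bucket.s3.amazonaws.com/reddit/comments/*.parquet', 'Parquet')
""")
```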
| ▲ | edmundsauto 6 months ago | parent | prev | next [-] | |
There are design patterns / architectures that data engineers often employ to make this less "sucky". Data modeling is magical! (Specifically talking about things like datelist and cumulative tables; see the sketch below.)
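For readers unfamiliar with those terms, here is a minimal pandas sketch of the cumulative/datelist idea (table and column names are invented): each day produces a new snapshot by combining yesterday's snapshot with today's activity, so history accumulates through appends rather than in-place updates.

```python
# Minimal sketch of a cumulative "datelist" table: instead of updating rows,
# each day's snapshot is derived from the previous snapshot plus today's data.
import pandas as pd

def build_cumulative(yesterday: pd.DataFrame, today: pd.DataFrame, ds: str) -> pd.DataFrame:
    """yesterday: columns [user_id, active_dates]; today: column [user_id]."""
    merged = yesterday.merge(today.assign(active_today=True), on="user_id", how="outer")
    merged["active_dates"] = merged.apply(
        lambda r: (r["active_dates"] if isinstance(r["active_dates"], list) else [])
        + ([ds] if pd.notna(r["active_today"]) else []),
        axis=1,
    )
    return merged[["user_id", "active_dates"]]

# Day 1: no prior snapshot exists yet.
empty = pd.DataFrame(columns=["user_id", "active_dates"])
day1 = build_cumulative(empty, pd.DataFrame({"user_id": ["a", "b"]}), "2024-01-01")

# Day 2: the day-1 snapshot is read, never mutated.
day2 = build_cumulative(day1, pd.DataFrame({"user_id": ["b", "c"]}), "2024-01-02")
print(day2)  # a: ["2024-01-01"], b: ["2024-01-01", "2024-01-02"], c: ["2024-01-02"]
```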
| ▲ | slt2021 6 months ago | parent | prev [-] | |
You are doing data warehousing wrong; you need to learn the basics of data warehousing best practices. A data warehouse consists of Slowly Changing Dimensions and Facts, and neither of these requires updates.
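For context, a Type 2 slowly changing dimension records a change by adding a new versioned row rather than overwriting the old one. Classic implementations often also close out the old row's end date, but an insert-only variant can derive validity at query time, which fits append-only stores. A minimal sketch of that variant, with invented field names:

```python
# Sketch of an insert-only slowly changing dimension (illustrative field names).
# Every change appends a new versioned row; the value effective at any point in
# time is derived at query time from valid_from, so no stored row is ever updated.
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class CustomerDim:
    customer_id: int
    phone: str
    valid_from: datetime

rows: list[CustomerDim] = []

def record_change(customer_id: int, phone: str, as_of: datetime) -> None:
    rows.append(CustomerDim(customer_id, phone, valid_from=as_of))  # append, never UPDATE

def phone_as_of(customer_id: int, when: datetime) -> str | None:
    """Point-in-time lookup: latest version effective on or before `when`."""
    versions = [r for r in rows if r.customer_id == customer_id and r.valid_from <= when]
    return max(versions, key=lambda r: r.valid_from).phone if versions else None

record_change(42, "+1-555-0100", datetime(2023, 1, 1))
record_change(42, "+1-555-0199", datetime(2024, 6, 1))   # phone changed: new row only
print(phone_as_of(42, datetime(2023, 12, 31)))           # +1-555-0100
print(phone_as_of(42, datetime(2024, 7, 1)))             # +1-555-0199
```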