the_real_cher a day ago
What is the trick that this and Dynamo use? Are they basically just large hash tables?
valyala 6 hours ago
There are two tricks used by ClickHouse and similar databases:
- Smart placement of data on disk, which allows skipping the majority of the data and reading only the needed chunks (these chunks are stored in compressed form to reduce disk read IO even further). This includes column-oriented storage and LSM-like trees.
- Brute-force optimizations all over the place, which allow processing the found data at maximum speed by using all the compute resources (CPU, RAM, disk IO, network bandwidth) as efficiently as possible. For example, ClickHouse can process more than a billion rows per second per CPU core, and the scan speed scales linearly with the number of available CPU cores.
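To make the first trick concrete, here is a rough sketch of chunk skipping. It is a toy model and not ClickHouse's actual storage code. It assumes a single integer column that is already sorted, split into fixed-size chunks that are compressed with zlib and tagged with their min and max key. The names `Chunk`, `build_chunks` and `range_query` are made up for illustration.

```python
import struct
import zlib
from dataclasses import dataclass


@dataclass
class Chunk:
    min_key: int   # smallest key in the chunk (kept uncompressed)
    max_key: int   # largest key in the chunk (kept uncompressed)
    blob: bytes    # zlib-compressed, packed little-endian 64-bit keys


def build_chunks(sorted_keys, chunk_size):
    """Split an already sorted column into compressed chunks with min/max tags."""
    chunks = []
    for i in range(0, len(sorted_keys), chunk_size):
        part = sorted_keys[i:i + chunk_size]
        raw = struct.pack(f"<{len(part)}q", *part)
        chunks.append(Chunk(part[0], part[-1], zlib.compress(raw)))
    return chunks


def range_query(chunks, lo, hi):
    """Return keys in [lo, hi] and the number of chunks that were decompressed."""
    hits = []
    chunks_read = 0
    for c in chunks:
        # The min/max tags alone tell us this chunk cannot match,
        # so it is skipped without touching its compressed data.
        if c.max_key < lo or c.min_key > hi:
            continue
        chunks_read += 1
        raw = zlib.decompress(c.blob)
        keys = struct.unpack(f"<{len(raw) // 8}q", raw)
        hits.extend(k for k in keys if lo <= k <= hi)
    return hits, chunks_read


# Hypothetical data: 100,000 sorted keys in chunks of 1,000 rows.
chunks = build_chunks(list(range(100_000)), 1_000)
hits, chunks_read = range_query(chunks, 41_500, 42_499)
print(f"read {chunks_read} of {len(chunks)} chunks, {len(hits)} matching rows")
# prints "read 2 of 100 chunks, 1000 matching rows"
```

The query only decompresses the two chunks whose key range overlaps the filter, and the other 98 are never read. This toy version still walks every min/max tag linearly. Because the data is sorted, a real engine could binary-search the tags instead.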