twotwotwo 2 days ago
There is a largish category of tools now where, unlike in OLTP systems, a big focus is scanning data, but quickly (still O(n), just with a good constant factor): Redshift, Trino/Athena, ClickHouse, and DuckDB, among others. Bloom filter indexing seems like a great fit if you ever need to do substring searches in a context like that, and for log searching in general. I haven't dug into which packages have it, but it looks like at least ClickHouse does: https://clickhouse.com/docs/optimize/skipping-indexes#bloom-...
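To make the idea concrete, here is a rough sketch of how an n-gram Bloom filter can let a scan skip whole blocks during a substring search. This is only an illustration of the general technique, not how ClickHouse or any of the other tools implement it; `NgramBloom`, `build_index`, `search`, and the sizes (4096 bits, 3 hashes, trigrams) are all made up for the example.

```python
import hashlib


class NgramBloom:
    """Bloom filter over the character n-grams of the text in one block (hypothetical helper)."""

    def __init__(self, num_bits=4096, num_hashes=3, n=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.n = n
        self.bits = 0  # a Python int used as a bit array

    def _ngrams(self, text):
        return {text[i:i + self.n] for i in range(len(text) - self.n + 1)}

    def _positions(self, gram):
        # Derive num_hashes bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.blake2b(
                gram.encode("utf-8"), digest_size=8, salt=bytes([seed])
            ).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, text):
        for gram in self._ngrams(text):
            for pos in self._positions(gram):
                self.bits |= 1 << pos

    def might_contain(self, needle):
        # Needles shorter than n have no n-grams, so nothing can be ruled out.
        if len(needle) < self.n:
            return True
        # If any n-gram of the needle is definitely absent, so is the needle.
        return all(
            (self.bits >> pos) & 1
            for gram in self._ngrams(needle)
            for pos in self._positions(gram)
        )


def build_index(blocks):
    """Build one filter per block of lines."""
    index = []
    for block in blocks:
        bloom = NgramBloom()
        for line in block:
            bloom.add(line)
        index.append(bloom)
    return index


def search(blocks, index, needle):
    """Return (matching lines, number of blocks actually scanned)."""
    hits, scanned = [], 0
    for block, bloom in zip(blocks, index):
        if not bloom.might_contain(needle):
            continue  # filter says "definitely not here": skip the block
        scanned += 1
        hits.extend(line for line in block if needle in line)
    return hits, scanned


blocks = [
    ["GET /index.html 200", "GET /about.html 200"],
    ["POST /login 500 upstream timeout", "GET /health 200"],
    ["GET /static/app.js 304", "GET /favicon.ico 404"],
]
index = build_index(blocks)
hits, scanned = search(blocks, index, "timeout")
print(hits, f"scanned {scanned} of {len(blocks)} blocks")
```

The filter can give false positives (a block gets scanned for nothing) but never false negatives, so the results match a full scan; the win is only in how many blocks you avoid reading.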