| ▲ | h33t-l4x0r 4 hours ago | |
Hmm, Bloom filters seem important. I'm wondering why my CS education never even touched on them and it's tbh triggering my imposter syndrome. | ||
| ▲ | benmanns 4 hours ago | parent | next [-] | |
They were only touched on (and just barely) in my CS education, so don’t feel too left out. Spend an evening or two on the Wiki for Probabilistic data structures[0]. With a CS education you should have the baseline knowledge to find them really fascinating. Enjoy! Oh, and I don’t find myself actually implementing any of these very often or knowing that they are in use. I occasionally use things like APPROX_COUNT_DISTINCT in Snowflake[1], which is a HyperLogLog (linked in the Wiki). [0]: https://en.wikipedia.org/wiki/Category:Probabilistic_data_st... [1]: https://docs.snowflake.com/en/sql-reference/functions/approx... | ||
| ▲ | vyr 2 hours ago | parent | prev | next [-] | |
they're common in databases and performance instrumentation of various kinds (as are other forms of data structure "sketch" like count sketches) but not as common outside those realms. i've gotten interview questions best solved with them a few times; a Microsoft version involved spell-checking in extremely limited memory, and the interviewer told me that they'd actually been used for that back in the PDP era. | ||
| ▲ | dheera 3 hours ago | parent | prev [-] | |
My education didn't touch upon it but I've been grilled on it multiple times in interviews. I learned about them after the first time I got grilled and rejected. Sucks to be the first company that grilled me about it, thanks for the tip though, you just didn't stick around long enough to see how fast I learn | ||