▲ | the_precipitate 4 days ago | ||||||||||||||||
To really appreciate inverted indexes, it’s worthwhile to study ISR (Inverted Stream Readers), a concept introduced by the great Mike Burrows. It’s also worth exploring encoding techniques like PForDelta. These elegant ideas demonstrate how true systems design masters can distill complex concepts into simple, powerful abstractions. Edit: I stand corrected: it's called index stream readers (thanks atombender for pointing this out). For those who knows Mike Burrows only for the Burrows-Wheeler transformation (BZip), you might also want to know that he was also one of the main developers of AltaVista, the first real search engine for the internet. He also designed the early versions of Bing search engine. Eventually he worked for Google and designed their lock service called Chubby. | |||||||||||||||||
▲ | marginalia_nu 4 days ago | parent | next [-] | ||||||||||||||||
Another very nice algorithm in the space is this one[1] for intersecting postings lists in sublinear time generally with very good cache characteristics to boot. Works with tree-based indexes as well as skip lists (though a more modern design might also use simple bloom filters to go with the skip pointers). [1] https://nlp.stanford.edu/IR-book/html/htmledition/faster-pos... | |||||||||||||||||
▲ | atombender 4 days ago | parent | prev [-] | ||||||||||||||||
I think you're thinking of index stream readers? | |||||||||||||||||
|