marginalia_nu 4 days ago

This is meant as a metaphor, a mental model for the actual data structures on disk, so there's some tradeoff in finding the most accurate way to describe what is happening.

I actually think you are right, list<pair<...>> is a bit of a weird choice that doesn't convey the data structures very well. Map is better.

The most accurate thing would probably be something like map<term_id, map<document_id, pair<document_id, positions_idx>>>, but I corrected it to just a map<document_id, positions_idx> to avoid making things too confusing.
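
To make the simplified model concrete, here's a minimal Python sketch. It assumes plain in-memory dicts and lists standing in for the on-disk structures, and the function names (add_occurrences, lookup) are hypothetical, made up for illustration rather than taken from the real code.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the on-disk structures.
# inverted_index: term_id -> {document_id -> positions_idx}
# positions:      list of position lists, addressed by positions_idx
inverted_index = defaultdict(dict)
positions = []


def add_occurrences(term_id, document_id, term_positions):
    """Record where a term occurs in a document."""
    positions.append(list(term_positions))
    # Store only an index into `positions`, not the positions themselves.
    inverted_index[term_id][document_id] = len(positions) - 1


def lookup(term_id, document_id):
    """Return the positions of a term in a document, or None if absent."""
    idx = inverted_index.get(term_id, {}).get(document_id)
    return None if idx is None else positions[idx]
```

The point of the indirection is that the inner map stays small (one integer per document), while the variable-length position data lives in a separate list.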

ch33zer 4 days ago | parent [-]

Currently it looks like this:

    map<term_id, 
      map<pair<document_id, positions_idx>>
      inverted_index;
    list<positions> positions;

Think you also meant to remove the pair in map<pair>?

marginalia_nu 4 days ago | parent [-]

Haha, apparently very hard to get this right. Fixed again.