kherud 4 hours ago

SQLite seems very powerful for building full-text search (FTS), where the user enters free text and expects results with high precision and recall. Still, I feel like it's non-trivial to get good search quality.

I think the naive approach is to tokenize the input and append "*" for prefix matching. I'm not too experienced and this can probably be improved a lot. There are many settings like different tokenizers, stemming, etc. Additionally, a lot can be built on top like weighting, boosting exact matches, etc.
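To make that naive approach concrete, here is a minimal sketch using Python's built-in `sqlite3` module and FTS5. It assumes the SQLite build has FTS5 enabled (most do, but not all). The table name `docs`, the sample rows, the helper names `build_prefix_query` and `search`, and the 10:1 title/body weights are all made up for illustration, not recommended values.

```python
import re
import sqlite3


def build_prefix_query(user_input: str) -> str:
    """Tokenize free text and append '*' to every token for prefix matching."""
    tokens = re.findall(r"\w+", user_input.lower())
    # Quote each token so words like AND / OR / NEAR in the user's input
    # are treated as plain text instead of FTS5 query operators.
    return " ".join(f'"{t}"*' for t in tokens)


def search(conn: sqlite3.Connection, user_input: str, limit: int = 10):
    query = build_prefix_query(user_input)
    if not query:
        return []  # nothing searchable in the input
    # bm25() returns smaller (more negative) values for better matches,
    # so ascending order puts the best hits first. The arguments after the
    # table name are per-column weights (illustrative: title 10x body).
    return conn.execute(
        "SELECT title FROM docs WHERE docs MATCH ? "
        "ORDER BY bm25(docs, 10.0, 1.0) LIMIT ?",
        (query, limit),
    ).fetchall()


conn = sqlite3.connect(":memory:")
# Default tokenizer is unicode61; tokenize='porter unicode61' would add stemming.
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
conn.executemany(
    "INSERT INTO docs(title, body) VALUES (?, ?)",
    [
        ("SQLite full-text search", "FTS5 supports prefix queries and bm25 ranking."),
        ("Gardening tips", "Water the tomatoes every morning."),
        ("Database notes", "SQLite is an embedded database engine."),
    ],
)

for (title,) in search(conn, "sqli sear"):
    print(title)
```

Tokens are combined with an implicit AND, so every prefix has to match somewhere in the row. Switching the tokenizer, loosening that AND, or boosting exact matches over prefix matches are the knobs I'd experiment with from here.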

Does anyone know good resources for this to learn and draw inspiration from?

fizx 3 hours ago

I mean, you can use SQLite as an index and then rebuild all of Lucene on top of it. It's non-trivial to get good search quality on top of actual search libraries too.

O'Reilly's "Relevant Search" isn't the worst here, but you'll be porting/writing a bit yourself.

subhobroto 3 hours ago

> Does anyone know good resources for this to learn and draw inspiration from?

Is there a reason why something more custom-built, like ParadeDB Community edition, won't meet your needs?

I understand you're asking about SQLite while ParadeDB is built on PostgreSQL, but as you said, it's non-trivial to get good search quality, so I'm trying to understand your situation and needs.