kherud 4 hours ago
SQLite seems very powerful for building full-text search (FTS), where the user enters free text and expects results with high precision and recall. Still, I feel like it's non-trivial to get good search quality. I think the naive approach is to tokenize the input and append "*" to each token for prefix matching. I'm not too experienced, and this can probably be improved a lot. There are many settings to tune, like different tokenizers and stemming. Additionally, a lot can be built on top, like weighting and boosting exact matches. Does anyone know good resources to learn from and draw inspiration from?
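For reference, the naive approach described above (tokenize the input, append "*" to each token) could be sketched roughly like this with Python's built-in `sqlite3` module. This assumes the underlying SQLite build includes FTS5. The table name `docs`, its columns, the helper names, the sample rows, and the 10:1 title-to-body weighting are all made up for illustration.

```python
import re
import sqlite3


def build_match_query(user_input: str) -> str:
    """Turn free text into an FTS5 MATCH expression of prefix terms."""
    tokens = re.findall(r"\w+", user_input)
    # Quoting each token keeps FTS5 keywords (AND, OR, NOT, NEAR) in the input
    # from being parsed as operators; the trailing * makes it a prefix term.
    # Terms separated by spaces are combined with an implicit AND.
    return " ".join(f'"{tok}"*' for tok in tokens)


def make_index(rows):
    """Create an in-memory FTS5 table (hypothetical schema) and load rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE VIRTUAL TABLE docs USING fts5(title, body, tokenize='unicode61')"
    )
    conn.executemany("INSERT INTO docs(title, body) VALUES (?, ?)", rows)
    return conn


def search(conn, user_input, limit=10):
    """Return matching titles, best match first."""
    query = build_match_query(user_input)
    if not query:  # an empty MATCH expression is a syntax error in FTS5
        return []
    # bm25() returns lower (more negative) values for better matches, so the
    # default ascending sort puts the best hit first. The weights 10.0 and 1.0
    # are arbitrary and boost title hits over body hits.
    sql = (
        "SELECT title FROM docs WHERE docs MATCH ? "
        "ORDER BY bm25(docs, 10.0, 1.0) LIMIT ?"
    )
    return [row[0] for row in conn.execute(sql, (query, limit))]


# Toy data, purely illustrative.
conn = make_index([
    ("SQLite full-text search", "FTS5 supports prefix queries and BM25 ranking."),
    ("Lucene basics", "A search library; some people rebuild it on SQLite."),
    ("Postgres notes", "Full-text search for PostgreSQL."),
    ("Stemming", "Reduces words to their root form."),
    ("Tokenizers", "Split text into terms."),
])
print(search(conn, "sqli sear"))
```

Stemming can be tried by switching the tokenizer to `'porter unicode61'`, though it is worth checking how stemmed terms interact with prefix queries before relying on both together.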
fizx 3 hours ago
I mean, you can use SQLite as an index and then rebuild all of Lucene on top of it. It's non-trivial to get good search quality on top of actual search libraries too. "Relevant Search" (Manning) isn't the worst here, but you'll be porting/writing a bit yourself.
subhobroto 3 hours ago
> Does anyone know good resources for this to learn and draw inspiration from?

Is there a reason why something more custom-built, like ParadeDB Community edition, won't meet your needs? I understand you're asking about SQLite, while ParadeDB is built on PostgreSQL. But as you know, it's non-trivial to get good search quality, so I'm trying to understand your situation and needs.