dmezzetti 6 days ago
Excellent article on BM25! Author of txtai [1] here. txtai implements a performant BM25 index in Python [2], using the arrays package and storing the term frequency vectors in SQLite. With txtai, the hybrid index approach [3] supports both a convex combination when BM25 scores are normalized and reciprocal rank fusion (RRF) when they aren't [4].

[1] https://github.com/neuml/txtai
[2] https://neuml.hashnode.dev/building-an-efficient-sparse-keyw...
[3] https://neuml.hashnode.dev/benefits-of-hybrid-search
[4] https://github.com/neuml/txtai/blob/master/src/python/txtai/...
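
For anyone unfamiliar with the two fusion strategies mentioned above, here is a minimal sketch of each in plain Python. This is not txtai's code: the function names, the 0.5 weight, and the sample scores are hypothetical and chosen only for illustration. The constant k=60 is the value commonly used for RRF.

```python
def convex_combination(sparse, dense, weight=0.5):
    """Blend two {doc_id: score} dicts.

    Assumes both score sets are already normalized to a comparable
    range such as [0, 1]. `weight` is the share given to dense scores
    (hypothetical default). Missing documents score 0.
    """
    ids = set(sparse) | set(dense)
    return {
        i: weight * dense.get(i, 0.0) + (1 - weight) * sparse.get(i, 0.0)
        for i in ids
    }


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids using ranks only.

    Each document receives 1 / (k + rank) from every list it appears
    in, so raw scores never need to be on the same scale.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return fused


# Illustrative data only
sparse_scores = {"a": 1.0, "b": 0.5}   # e.g. normalized BM25
dense_scores = {"a": 0.2, "c": 0.8}    # e.g. vector similarity

blended = convex_combination(sparse_scores, dense_scores)
fused = reciprocal_rank_fusion([["a", "b"], ["c", "a"]])

print(sorted(blended, key=blended.get, reverse=True))
print(sorted(fused, key=fused.get, reverse=True))
```

The convex combination only makes sense when both score sets live on the same scale, which is why unnormalized BM25 scores call for a rank-based method like RRF instead.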