Hybrid search (BM25/vectors/RRF) barely improved over pure semantic
1 point by pjmalandrino 8 hours ago
My setup: ~600 technical docs (about 50 pages each on average, with lots of schemas and diagrams), chunked and embedded with BGE-M3, with pgvector as the vector DB. Pure semantic retrieval was OK but not great on these docs. I read everywhere that hybrid search with RRF was supposed to be the next level, so I implemented it: BM25 + vector search + RRF fusion. Result: almost no improvement. Like, negligible. Am I missing something obvious? Is hybrid search overhyped for technical docs with lots of schemas and tables, or is my setup just broken?
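
For reference, the fusion step described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation from the setup in question: the function names `rrf_fuse` and `overlap_at_k`, the doc ids, and the constant `k=60` (a commonly used default) are all assumptions. The overlap helper is there because RRF only sees ranks: if the BM25 and vector top-k lists already contain mostly the same chunks, the fused list cannot differ much from either input, which would be one possible explanation for a negligible gain.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, weights=None):
    """Fuse ranked lists of doc ids with Reciprocal Rank Fusion.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (60 is a common default, assumed here).
    weights: optional per-retriever weights (hypothetical extension).
    """
    if weights is None:
        weights = [1.0] * len(rankings)
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        # Ranks are 1-based: the top hit contributes weight / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def overlap_at_k(a, b, k):
    """Fraction of the top-k ids shared by two rankings."""
    return len(set(a[:k]) & set(b[:k])) / k


# Made-up doc ids, purely for illustration.
bm25_hits = ["d3", "d1", "d7"]
vector_hits = ["d1", "d2", "d3"]

print(rrf_fuse([bm25_hits, vector_hits]))  # ['d1', 'd3', 'd2', 'd7']
print(overlap_at_k(bm25_hits, vector_hits, 3))  # 2 of the top 3 are shared
```

Running `overlap_at_k` on real queries before and after fusion is a cheap sanity check: an overlap close to 1.0 suggests the two retrievers are returning nearly the same chunks, while a low overlap with no quality gain points more toward the ranking inputs themselves (tokenization, chunking) than toward the fusion step.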