kgeist 7 hours ago
Are there vector DBs with 100B vectors in production which work well? There was a paper showing a 12% loss in accuracy at just 1 million vectors. Maybe some kind of logical sharding is another option, to improve both accuracy and speed.
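A minimal sketch of that logical-sharding idea, assuming the corpus can be partitioned by some metadata key (the `category` field and the random corpus here are hypothetical): each query is routed to a single smaller shard instead of scanning everything.

    # Logical sharding sketch: partition vectors by a metadata key and
    # search only the matching shard. Brute-force cosine search per shard
    # keeps the example self-contained.
    import numpy as np
    from collections import defaultdict

    dim = 128
    rng = np.random.default_rng(0)

    # Hypothetical corpus: (vector, category) pairs.
    corpus = [(rng.standard_normal(dim), f"cat{i % 4}") for i in range(10_000)]

    # Build one small brute-force "index" per logical shard.
    shards = defaultdict(list)
    for vec, category in corpus:
        shards[category].append(vec / np.linalg.norm(vec))
    shards = {k: np.vstack(v) for k, v in shards.items()}

    def search(query, category, k=5):
        # Route the query to a single shard instead of the full corpus.
        q = query / np.linalg.norm(query)
        scores = shards[category] @ q          # cosine similarity
        top = np.argsort(-scores)[:k]
        return top, scores[top]

    ids, scores = search(rng.standard_normal(dim), "cat2")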
lmeyerov 4 hours ago
I don't know at these scales, but at 1M-100M vectors we found that switching from out-of-the-box embeddings to fine-tuning our own embeddings took much of the sting out of the compression/recall trade-off. We saw a 10-100X win in compression at comparable recall. I'm not sure how that would interact with the binary quantization phase, though. For example, we use Matryoshka embeddings, and some of the bits matter way more than others, so that might be super painful.
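A rough numpy sketch of the combination described above, i.e. Matryoshka-style truncation followed by 1-bit binary quantization and Hamming-distance search. Dimensions, names, and random data are illustrative assumptions, not the commenter's actual setup; the point is that a uniform sign-based quantizer treats every kept dimension the same, even though the leading Matryoshka dimensions carry more signal.

    import numpy as np

    rng = np.random.default_rng(0)
    full = rng.standard_normal((1000, 1024)).astype(np.float32)  # stand-in embeddings

    # Matryoshka-style truncation: keep only the first d dimensions, then renormalize.
    def truncate(embs, d):
        cut = embs[:, :d]
        return cut / np.linalg.norm(cut, axis=1, keepdims=True)

    # Binary quantization: 1 bit per kept dimension (sign), packed into bytes.
    def binarize(embs):
        return np.packbits(embs > 0, axis=1)

    docs_256 = truncate(full, 256)     # 4x fewer dims than 1024-d float
    codes = binarize(docs_256)         # 256 bits = 32 bytes per vector

    # Hamming distance as the coarse search metric over binary codes.
    def hamming(query_code, codes):
        return np.unpackbits(query_code ^ codes, axis=1).sum(axis=1)

    q = binarize(truncate(rng.standard_normal((1, 1024)).astype(np.float32), 256))
    nearest = np.argsort(hamming(q, codes))[:10]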
jasonjmcghee 4 hours ago
So many missing details... Different vector indexes have very different recall, and even the parameters of each one dramatically affect it. HNSW can have very good recall even at high vector counts. There's also the embedding model, whether you're quantizing, whether it's pure RAG vs. hybrid BM25 / static word embeddings vs. graph connections, whether you're reranking, etc.
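To illustrate the parameter point with hnswlib (values are illustrative, not recommendations): M and ef_construction fix the graph quality at build time, while ef trades query latency for recall at query time, so the same dataset can show very different recall depending on these settings.

    import hnswlib
    import numpy as np

    dim, n = 128, 50_000
    rng = np.random.default_rng(0)
    data = rng.standard_normal((n, dim)).astype(np.float32)

    # Build-time parameters: M (graph degree) and ef_construction (build beam width).
    index = hnswlib.Index(space="cosine", dim=dim)
    index.init_index(max_elements=n, M=32, ef_construction=200)
    index.add_items(data, np.arange(n))

    index.set_ef(50)    # low ef: faster queries, lower recall
    labels_fast, _ = index.knn_query(data[:100], k=10)

    index.set_ef(400)   # high ef: slower queries, recall approaches brute force
    labels_slow, _ = index.knn_query(data[:100], k=10)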
_peregrine_ 6 hours ago
The solution described in the blog post is currently in production at 100B vectors.
| ||||||||||||||||||||||||||||||||