Just FYI, for RAG/similarity search, adding a reranker was a much bigger payoff than switching embedding models.
What top-k do you use for vector search before passing the results into the reranker?
At a minimum, you increase top-k to cast a wider net, then after reranking, take the N you really want. You have to play around with it a bit, but that’s the idea.
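For anyone who wants to see the shape of it, here's a rough sketch of the "wide top-k, then rerank, then keep N" flow. Everything in it is a stand-in: `vector_search` is a brute-force cosine scan rather than a real vector store, and `overlap_score` is a toy word-overlap scorer used where a real reranker (e.g. a cross-encoder) would go. The names and the default values for `top_k` and `final_n` are made up for illustration, not recommendations.

```python
import math
from typing import Callable, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_search(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    top_k: int,
) -> List[int]:
    """Brute-force stand-in for a vector store: indices of the top_k closest docs."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:top_k]


def overlap_score(query: str, doc: str) -> float:
    """Toy reranker (assumption): fraction of query words that appear in the doc."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0


def retrieve_and_rerank(
    query: str,
    query_vec: Sequence[float],
    docs: Sequence[str],
    doc_vecs: Sequence[Sequence[float]],
    rerank_score: Callable[[str, str], float] = overlap_score,
    top_k: int = 50,   # wide net for the cheap vector search (illustrative value)
    final_n: int = 5,  # how many results you actually want (illustrative value)
) -> List[str]:
    # Stage 1: cast a wide net with vector similarity.
    candidates = vector_search(query_vec, doc_vecs, top_k)
    # Stage 2: rescore only those candidates with the (more expensive) reranker.
    reranked = sorted(
        candidates,
        key=lambda i: rerank_score(query, docs[i]),
        reverse=True,
    )
    # Stage 3: keep the N you really want.
    return [docs[i] for i in reranked[:final_n]]
```

The knob to play with is the gap between `top_k` and `final_n`: if `top_k` is too small, a good document that the embedding ranked poorly never reaches the reranker; if it's larger, the reranker has more to score per query.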