Remix.run Logo
marcyb5st 6 days ago

I built a reranker for a RAG system using a tiny model. After the candidate generation (i.e. vector search + BM25) and business logic filters/ACL checks the remainder of the chunks went through a model that given the user query told you whether or not the chunk was really relevant. That hit production, but once the context size of models grew that particular piece was discarded as passing everything yielded better results and prices (the fact that prices of input tokens went down also played a role I am sure).

So only for a while, but it still counts :)