jerpint 8 days ago:
I wonder at what point it will be ~as much overhead to pass a subset of the data through a small yet capable and fast LLM as it is to use a crude dot product when doing retrieval.
Pringled 7 days ago (in reply):
I think a combination works quite well: first get a small set of candidates from all the data using a lightweight model, and then use a heavy-duty model to rerank those results and produce the final candidates.
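
A minimal sketch of that two-stage setup, assuming the sentence-transformers library; the specific models (a MiniLM bi-encoder for cheap dot-product retrieval, an MS MARCO cross-encoder for reranking) and the candidate count are illustrative choices, not something named in the comments above:

```python
# Two-stage retrieval sketch (model names and data are illustrative).
from sentence_transformers import SentenceTransformer, CrossEncoder
import numpy as np

docs = [
    "The Eiffel Tower is in Paris.",
    "Transformers are a neural network architecture.",
    "Paris is the capital of France.",
    "Dot products measure vector similarity.",
]
query = "What city is the Eiffel Tower located in?"

# Stage 1: cheap candidate retrieval with a small bi-encoder and a dot product.
bi_encoder = SentenceTransformer("all-MiniLM-L6-v2")
doc_emb = bi_encoder.encode(docs, normalize_embeddings=True)
query_emb = bi_encoder.encode(query, normalize_embeddings=True)
scores = doc_emb @ query_emb              # cosine similarity (vectors normalized)
top_k = np.argsort(-scores)[:3]           # keep a small candidate set

# Stage 2: rerank only the candidates with a heavier cross-encoder.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
pairs = [(query, docs[i]) for i in top_k]
rerank_scores = reranker.predict(pairs)
ranked = [docs[i] for i in top_k[np.argsort(-rerank_scores)]]
print(ranked[0])
```

The expensive model only ever sees the handful of candidates, so its cost stays roughly constant as the corpus grows, while the dot-product stage scales to the full dataset.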