supo 10 hours ago
This article focuses on ways to make "pre-fetching" more accurate, which reduces or eliminates the need for reranking. That improves latency and cost, and sometimes quality too. For example, if you use a text cross-encoder to rerank your structured objects, you'll find that those rerankers don't actually understand much about numbers, locations, and other data like that.