kingkongjaffa 5 days ago
The steps in this article are also the same process used for RAG. You compute an embedding vector for your documents (or chunks of documents), then compute the vector for your user's prompt, and use the cosine distance to find the most semantically relevant documents to use. There are other tricks, like reranking the documents once you find the top N relating to the query, but that's basically it. Here's a good explanation
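To make that concrete, here's a rough Python sketch of the retrieval step. The `embed()` callable is a stand-in for whatever embedding model you're using (it's an assumption here, not a real API), and the rest is just cosine similarity over the chunk vectors:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity = dot product of the two vectors divided by
    # the product of their norms.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_n_chunks(query: str, chunks: list[str], embed, n: int = 5) -> list[str]:
    # Embed each document chunk (in practice you'd precompute and cache these).
    chunk_vectors = [np.asarray(embed(c)) for c in chunks]
    # Embed the user's prompt with the same model.
    query_vector = np.asarray(embed(query))
    # Rank chunks by similarity to the query and keep the top N.
    scored = sorted(
        zip(chunks, chunk_vectors),
        key=lambda pair: cosine_similarity(query_vector, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:n]]
```

The top-N chunks returned here would then go to a reranker (or straight into the prompt) in a typical RAG pipeline.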