RAG is nowhere near obsolete. Model performance degrades sharply on very long sequences, because sequences that long are poorly represented in training data, and the sub-quadratic approximations of attention used to make long contexts affordable are not that good.