foobar10000 3 days ago

IMO RAG works great for some things, while for others you may need attention, which is why people report such completely disparate experiences with RAG.

As an example, if one is chunking inputs into a RAG, one is basically hardcoding a feature based on locality, which may or may not work. If it works, as in, locality is a good feature for the data (the attention matrix is really tail-heavy, LSTMs would work, etc.), then hey, vector DBs work beautifully. But for many things where people have trouble with RAG, the locality assumption is heavily violated, and there you _need_ the full-on attention matrix.
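
To make the locality point concrete, here is a toy sketch (my own illustration, not any particular RAG library). It assumes fixed-size, non-overlapping word chunks and treats "a single chunk contains both terms" as a stand-in for "retrieval can surface the answer". The helper names `chunk_words` and `chunks_containing_all` and both example documents are made up for this example.

```python
def chunk_words(text, chunk_size):
    """Split text into consecutive, non-overlapping chunks of chunk_size words."""
    words = text.split()
    return [words[i:i + chunk_size] for i in range(0, len(words), chunk_size)]


def chunks_containing_all(chunks, required_terms):
    """Return indices of chunks that contain every required term (case-insensitive)."""
    required = {t.lower() for t in required_terms}
    return [
        i for i, chunk in enumerate(chunks)
        if required <= {w.lower().strip(".,") for w in chunk}
    ]


filler = " ".join(["filler"] * 20)

# Locality holds: the two facts we need sit next to each other.
local_doc = f"{filler} Alice manages Bob. {filler}"

# Locality violated: the two facts are separated by unrelated text.
nonlocal_doc = f"Alice manages the team. {filler} Bob joined the team."

query_terms = ["alice", "bob"]

local_hits = chunks_containing_all(chunk_words(local_doc, 8), query_terms)
nonlocal_hits = chunks_containing_all(chunk_words(nonlocal_doc, 8), query_terms)

print(local_hits)     # [2]  -> one chunk carries the whole relationship
print(nonlocal_hits)  # []   -> no single chunk does; the link spans chunks
```

In the second document the connection between Alice and Bob only exists across chunk boundaries, so any per-chunk feature misses it regardless of how good the embedding is. That is the case where you would want attention over the whole input rather than over retrieved pieces.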