joefourier 17 hours ago:
Then what do you call RAG done well? You need a term for it.

> And when you hear someone saying "we use RAG here" 95% of the time this is exactly what they mean.

That's just Sturgeon's law in action: 95% of every implementation is crap. Back in the 90s you might have heard "we use OOP here" and come to a similar conclusion, but that doesn't mean you need to invent a new word for doing OOP properly.

> But agentic RAG is fundamentally different.

From an implementation POV, absolutely not. I've personally, gradually, converted a dumb semantic search into a more fully featured agentic RAG in small steps like these:

- Have a separate LLM call write the query instead of just using the user's message.
- Make the RAG search a synthetic injected tool call, instead of appending it to the system prompt.
- Improve the search endpoint by using an LLM to pre-process the data into structured chunks with hierarchical categories, tags, and possible search queries, embedding the search queries separately from the desired information (versus originally just having a raw blob).
- Let the LLM search both with a semantic sentence and with a list of tags.
- Let the LLM view and navigate the hierarchy in a tree-like manner.
- Make the original LLM able to call the search on its own, instead of it being automatically injected via a separate query-rewriting call, letting it search in multiple rounds and refine its own queries.

When did the system go from RAG to "not RAG"? Because fundamentally, all you need to do to make an agentic RAG is to have the LLM write/rewrite its own search queries (possibly in multiple passes), as opposed to just passing the user's message(s) through directly.
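To make the last point concrete, here is a minimal sketch of that multi-round loop: the LLM gets a search tool and may call it repeatedly, refining its own query, instead of having one retrieval result injected up front. The LLM is stubbed out and all names (`call_llm`, `search`, the tag-chunked corpus) are illustrative, not any particular framework's API.

```python
# Toy corpus: chunks pre-processed into (tags, text), as described above.
CHUNKS = [
    ({"auth", "api"}, "API keys are created in the dashboard under Settings."),
    ({"auth", "oauth"}, "OAuth tokens expire after 1 hour and must be refreshed."),
    ({"billing"}, "Invoices are issued on the first of each month."),
]

def search(query_tags):
    """Tag-based search endpoint exposed to the agent as a tool."""
    return [text for tags, text in CHUNKS if query_tags & tags]

def call_llm(messages):
    """Stub standing in for a real model that emits tool calls or an answer.
    Here it deterministically rewrites, then refines, then answers."""
    last = messages[-1]
    if last["role"] == "user":
        # First pass: rewrite the user's message into a search query.
        return {"tool": "search", "tags": {"auth"}}
    if last["role"] == "tool" and len(last["content"]) > 1:
        # Too many hits: refine the query in a second round.
        return {"tool": "search", "tags": {"oauth"}}
    return {"answer": last["content"][0]}

def agent(user_message, max_rounds=3):
    """Agentic loop: the model decides when to search and when to answer."""
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_rounds):
        action = call_llm(messages)
        if "answer" in action:
            return action["answer"]
        messages.append({"role": "tool", "content": search(action["tags"])})
    return None

print(agent("How long do my login tokens last?"))
```

The contrast with "dumb" RAG is entirely in `agent`: classic RAG would call `search` exactly once, on the raw user message, before the only LLM call.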
ozim 4 hours ago:
I like the audacity of the parent poster, who equates the 95% of implementations he has seen with 95% of all there are, when it could easily be 0.01% of all there is. The world is much bigger than we think :)
hbrn 11 hours ago:
> all you need to do to make an agentic RAG is to have the LLM be able to write/rewrite its own search queries (possibly in multiple passes)

I think this is a huge oversimplification; the term "search query" is doing a lot of heavy lifting here. When Claude Code calls something like

    …

to understand the project hierarchy before doing any of the grep calls, I don't think it's fair to call that just a "search query"; it's more like an "analyze query". Just because text goes in and out in both cases doesn't mean it's all the same.

When you give the agent the ability to query the nature of the data (e.g. its hierarchy), and not just the data itself, you need to design your product around that. Agentic RAG has an entirely different implementation, different product implications, cost, latency, and, above all, different outcomes. I don't think it's useful to pretend it's just a different flavor of the same thing simply because, at the end of the day, it's all just text flying over the network.
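One way to see the distinction: a retrieval tool returns data, while an "analyze"-style tool returns the shape of the data, so the agent can plan before fetching anything. A small sketch, with hypothetical tool names (`search_tool`, `tree_tool`) and a toy docs corpus:

```python
# Toy documentation corpus keyed by hierarchical path.
DOCS = {
    "guides/auth/oauth.md": "OAuth tokens expire after 1 hour.",
    "guides/auth/api-keys.md": "API keys are created in the dashboard.",
    "guides/billing/invoices.md": "Invoices are issued monthly.",
}

def search_tool(term):
    """Classic RAG-style tool: returns paths whose content matches."""
    return [path for path, text in DOCS.items() if term in text.lower()]

def tree_tool():
    """'Analyze'-style tool: returns the hierarchy, not the content,
    so the agent can decide which subtree is even worth searching."""
    tree = {}
    for path in DOCS:
        node = tree
        for part in path.split("/")[:-1]:
            node = node.setdefault(part, {})
        node.setdefault("_files", []).append(path.split("/")[-1])
    return tree

# An agent might call tree_tool() first, notice "guides/auth", and only
# then issue targeted search_tool calls inside that area.
print(tree_tool())
print(search_tool("oauth"))
```

Supporting the second tool well pushes you toward structuring the corpus itself (paths, categories, metadata), which is the product-design work the comment is pointing at.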