brokegrammer 3 days ago

Many people question the usefulness of 1M tokens because LLMs often start to get confused after about 100k. But this is big for Claude 4 because it uses automatic RAG when the context becomes large. With retrieval optimized by RAG, we'll be able to make good use of those 1M tokens.

m4r71n 3 days ago | parent [-]

How does this work under the hood? Does it build an in-memory vector database of the input sources and run queries on top of that data to supplement the context window?

brokegrammer 3 days ago | parent | next [-]

No idea how it's implemented because it's proprietary. Details here: https://support.anthropic.com/en/articles/11473015-retrieval...

menaerus 2 days ago | parent | prev [-]

RAG commonly implies building some sort of vector database, which is then used to augment responses. If it operates over the repo, I believe it will index your codebase using vector embeddings.
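
To make that concrete, here's a toy sketch of the retrieve-then-augment pattern being described. This is not how Anthropic's feature is implemented (as noted above, that's proprietary); the bag-of-words "embedding" is a stand-in for a real neural encoder, and the list-of-vectors "index" stands in for an actual vector database:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in "embedding": raw token counts. Real RAG systems use a
    # learned embedding model that maps text to dense vectors.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_index(chunks):
    # Stand-in "vector database": a list of (chunk, vector) pairs.
    return [(c, embed(c)) for c in chunks]

def retrieve(index, query, k=2):
    # Rank all indexed chunks by similarity to the query and return
    # the top k; these would be spliced into the model's context.
    qv = embed(query)
    ranked = sorted(index, key=lambda cv: cosine(qv, cv[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

# Hypothetical codebase chunks for illustration.
chunks = [
    "def parse_config(path): open the YAML config file",
    "def connect_db(url): create a database connection pool",
    "def render_page(template): render the HTML template",
]
index = build_index(chunks)
print(retrieve(index, "where is the database connection created?", k=1))
```

The key idea is that only the chunks most relevant to the query get pulled into the context window, instead of dumping the whole repo in at once.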