brokegrammer 3 days ago

Many people question the usefulness of 1M tokens because LLMs often start to get confused after about 100k. But this is big for Claude 4 because it uses automatic RAG when the context becomes large. With retrieval optimized by RAG, we'll be able to make good use of those 1M tokens.

m4r71n 3 days ago | parent [-]

How does this work under the hood? Does it build an in-memory vector database of the input sources and run queries on top of that data to supplement the context window?

brokegrammer 3 days ago | parent | next [-]

No idea how it's implemented because it's proprietary. Details here: https://support.anthropic.com/en/articles/11473015-retrieval...

menaerus 2 days ago | parent | prev [-]

RAG commonly implies building some sort of vector database, which is then used to augment responses. If it operates over the repo, I believe it will index your codebase using vector embeddings.
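
To make that concrete, here's a toy sketch of the retrieve-then-augment pattern being described. This is not how Anthropic's feature is implemented (as noted above, that's proprietary); the bag-of-words "embedding" is a stand-in for a real neural encoder, and the list-of-vectors "index" stands in for an actual vector database:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in "embedding": raw token counts. Real RAG systems use a
    # learned embedding model that maps text to dense vectors.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_index(chunks):
    # Stand-in "vector database": a list of (chunk, vector) pairs.
    return [(c, embed(c)) for c in chunks]

def retrieve(index, query, k=2):
    # Rank all indexed chunks by similarity to the query and return
    # the top k; these would be spliced into the model's context.
    qv = embed(query)
    ranked = sorted(index, key=lambda cv: cosine(qv, cv[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

# Hypothetical codebase chunks for illustration.
chunks = [
    "def parse_config(path): open the YAML config file",
    "def connect_db(url): create a database connection pool",
    "def render_page(template): render the HTML template",
]
index = build_index(chunks)
print(retrieve(index, "where is the database connection created?", k=1))
```

The key idea is that only the chunks most relevant to the query get pulled into the context window, instead of dumping the whole repo in at once.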