asixicle 3 hours ago

That's what the embedding model is for. It's like a tack-on LLM that works out the relevancy and context to grab.

nprateem 2 hours ago | parent [-]

God knows why you think this is possible. If I don't even know what might be relevant to the conversation several turns from now, there's no way an agent could know either.

asixicle 2 hours ago | parent [-]

One of us is confusing prediction with retrieval. The embedding model doesn't predict what is going to be relevant several turns from now, just on the turn at hand. Each turn gets a fresh semantic search against the full body of memory/agent comms. If the conversation or prompt changes, the next query surfaces different context automatically.

As you build up a "body of work", it gets better at handling massive, disparate tasks, at least in my admittedly short experience. Been running this for two weeks. Trying to improve it.
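The per-turn retrieval loop described above can be sketched roughly like this. Note this is a toy illustration, not the commenter's actual setup: `embed` here is a bag-of-words stand-in for a real embedding model, and `MemoryStore` is a hypothetical name. The point it demonstrates is that nothing is predicted ahead of time; every turn re-ranks the whole memory against the current query only.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy stand-in for a real embedding model: a bag-of-words
    # count vector. A production system would call an actual
    # embedding model here and get a dense vector back.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class MemoryStore:
    def __init__(self):
        self.items = []  # list of (text, vector) pairs

    def add(self, text):
        # Embed once at write time; the vector is reused every turn.
        self.items.append((text, embed(text)))

    def retrieve(self, query, k=2):
        # Fresh semantic search on every turn: rank ALL stored
        # memories against the current query, nothing is carried
        # over from previous turns.
        qv = embed(query)
        ranked = sorted(self.items,
                        key=lambda item: cosine(qv, item[1]),
                        reverse=True)
        return [text for text, _ in ranked[:k]]

memory = MemoryStore()
memory.add("user prefers dark mode in the editor")
memory.add("deploy target is a staging kubernetes cluster")
memory.add("the editor crashes when dark mode is enabled on linux")

# A different prompt on the next turn surfaces different context.
print(memory.retrieve("editor dark mode crashes", k=1))
print(memory.retrieve("which cluster do we deploy to", k=1))
```

Because the query is re-embedded each turn, the store needs no notion of what will matter later; relevance is computed only when a turn actually asks for it.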