zknill 3 hours ago

> "and which ones are no longer relevant."

This is absolutely the hardest bit.

I guess the shortcut is to include the whole chat conversation history, so that if it contains "do X" followed by "no actually do Y instead", the LLM can figure that out. But isn't it fairly tricky for the agent harness to work out relevancy and decide what context to keep? Perhaps this is why the industry defaults to concatenating messages into a conversation stream?

asixicle 3 hours ago | parent | next [-]

That's what the embedding model is for. It's like a tack-on LLM that works out the relevancy and context to grab.

nprateem 2 hours ago | parent [-]

God knows why you think this is possible. If I don't even know what might be relevant to the conversation in several turns, there's no way an agent could either.

asixicle 2 hours ago | parent [-]

One of us is confusing prediction with retrieval. The embedding model doesn't predict what is going to be relevant in several turns, just on the turn at hand. Each turn gets a fresh semantic search against the full body of memory/agent comms. If the conversation or prompt changes the next query surfaces different context automatically.

As you build up a "body of work" it gets better at handling large, disparate tasks, at least in my admittedly short experience: I've been running this for two weeks and am still trying to improve it.
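The per-turn retrieval described above can be sketched as follows. This is a toy illustration, not the commenter's actual system: the `embed` function here is just a bag-of-words stand-in for a real embedding model, and the memory entries, `retrieve` helper, and queries are all made up for the example.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would call an embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(memory: list[str], query: str, k: int = 2) -> list[str]:
    # Fresh semantic search against the full body of memory on every turn.
    # Nothing is "kept" between turns, so if the conversation changes,
    # the next query surfaces different context automatically.
    q = embed(query)
    ranked = sorted(memory, key=lambda m: cosine(embed(m), q), reverse=True)
    return ranked[:k]

memory = [
    "user: do X with the billing service",
    "user: no actually do Y instead",
    "notes: deploy steps for the billing service",
]

# Turn 1: a deployment question surfaces billing/deploy entries.
print(retrieve(memory, "how do I deploy the billing service"))
# Turn 2: a different question surfaces the correction instead.
print(retrieve(memory, "should I do X or Y"))
```

The point is that relevancy isn't predicted ahead of time; it's re-scored from scratch against whatever the current turn asks.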

vdelpuerto 2 hours ago | parent | prev [-]

The shortcut works sometimes. But if X is common in the training data and Y is rare, the model regresses to X on the next turn even with "do Y, not X" right there in the history. That's @vanviegen's "fighting instincts" point: you can't trust the model to read the correction. Gate it before the model runs instead of inferring it from context.
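One way to read "gate it before the model runs" is to strip the superseded instruction out of the history before the prompt is built, so the model never sees the conflicting "do X" at all. A minimal sketch, where the `CORRECTION` markers and `gate_history` helper are hypothetical names invented for this example:

```python
import re

# Hypothetical correction markers; a real gate would need to be more robust.
CORRECTION = re.compile(r"\b(no,? actually|instead|scratch that)\b", re.IGNORECASE)

def gate_history(messages: list[str]) -> list[str]:
    # Drop the instruction immediately preceding a correction, so the
    # model never sees the superseded "do X" and can't regress to it.
    gated: list[str] = []
    for msg in messages:
        if CORRECTION.search(msg) and gated:
            gated.pop()  # remove the instruction this message overrides
        gated.append(msg)
    return gated

history = [
    "do X with the exporter",
    "no actually do Y instead",
    "also add logging",
]
print(gate_history(history))
# ['no actually do Y instead', 'also add logging']
```

The gate is deterministic code running before inference, which is exactly why it can be trusted where "the correction is in the context, surely the model will honor it" cannot.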