mtrifonov 2 hours ago

You're right, but I think you're describing flat memory. The agent gets distracted because every old fact carries the same weight as the current one. That's a salience problem.

What works in production for me is typed memory with very different decay curves. Personality and relationships are essentially permanent. Preferences fade in months. Stated intent fades in weeks. Emotion and events fade in days. Reinforcement (repeated recall) keeps things alive regardless of type.
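
Something like this, as a toy sketch (the type names, half-lives, and MemoryItem class here are illustrative, not literal values from any real system):

    import math
    import time

    # Illustrative half-lives per memory type (days -> seconds):
    # personality/relationships ~ permanent, preferences ~ months,
    # stated intent ~ weeks, emotion/events ~ days.
    HALF_LIFE = {
        "personality": float("inf"),
        "preference": 90 * 86400,
        "intent": 14 * 86400,
        "event": 3 * 86400,
    }

    class MemoryItem:
        def __init__(self, kind, text):
            self.kind = kind
            self.text = text
            self.last_recalled = time.time()

        def reinforce(self):
            # Repeated recall resets the clock, so reinforced items
            # stay alive regardless of their type's decay curve.
            self.last_recalled = time.time()

        def salience(self, now=None):
            now = now if now is not None else time.time()
            hl = HALF_LIFE[self.kind]
            if math.isinf(hl):
                return 1.0
            age = now - self.last_recalled
            return 0.5 ** (age / hl)  # exponential decay by half-life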

Cross-project co-mingling stops because project-specific stuff actually decays out of relevance while who the user is persists. There's also a filter on what even gets written: it classifies information as globally or locally relevant and writes it accordingly (if at all). Most of the noise you're describing comes from systems that store everything they observe.
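
The write filter amounts to routing roughly like this (again illustrative; route_write and the store layout are made-up names, reusing the MemoryItem kinds from the sketch above):

    GLOBAL_KINDS = {"personality", "preference"}  # who the user is
    LOCAL_KINDS = {"intent", "event"}             # project-specific

    def route_write(item, project_id, global_store, project_stores):
        # Identity-level facts go to a global store, task-level facts
        # go to the current project's store, and anything else is
        # simply not written rather than accumulating as noise.
        if item.kind in GLOBAL_KINDS:
            global_store.append(item)
        elif item.kind in LOCAL_KINDS:
            project_stores.setdefault(project_id, []).append(item)
        # else: drop it; not everything observed is worth remembering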

Flat memory failing is real. Memory failing in general is a stronger claim than that.

SwellJoe 2 hours ago

I'm making the stronger claim. I don't think what people call "memory" (it isn't really memory; the memories an LLM has are baked in at training, everything else is context), no matter how fancy, improves outcomes, at least for the work I do on the software I work on. I just don't think the agent needs what people are calling memory.

I think the base truth is the code, which can be loaded into context at no greater cost than whatever "memory" system you're using (probably lower, actually). A few hints in the documentation fill out the rest of the picture.

You can't realistically give an LLM memory, as current technology doesn't allow retraining the model on the fly. You can only give it more data to ingest into its context. Unless that data is directly relevant to the task at hand, it's probably detrimental. At best, it is just burning tokens for no benefit.

netcan 2 hours ago

Useful comment. Thanks.