endymi0n 4 hours ago

I've experimented quite a bit with mem0 (which is similar in design) for my OpenClaw and stopped using it very soon. My impression is that "facts" are an incredibly dull and far too rigid tool for any actual job at hand, and for me they were a step back rather than forward in daily use. In the end, the extracted "facts database" was a complete mess of largely incomplete, invalid, inefficient and unhelpful sentences that didn't help any of my conversations, and after the third injected wrong fact I went back to QMD and prose/summarization. Prose is sometimes slightly worse at updating stuck facts, but I'll take a 1000% better big picture and usefulness over working with "facts".

The failure modes were multiple:

- Facts rarely exist in a vacuum; they carry a lot of subtlety.
- Inferring facts from conversation has a gazillion failure modes. Irony and sarcasm in particular lead to hilarious outcomes (joking about a sixpack with a fat buddy -> "XYZ is interested in achieving an athletic form"), but even something as simple as extracting a concrete date too often goes wrong.
- Facts are almost never as binary as they seem. "ABC has the flights booked for the Paris trip". Then I decided afterwards to continue to New York to visit a friend instead of going home, and completely stumped the agent.

pranabsarkar 3 hours ago | parent [-]

Fair criticism, and the failure modes you describe aren't mem0-specific: they hit any system that extracts atomic facts from conversation. I hit a couple of them today while benchmarking YantrikDB's own consolidation (see my reply to polotics): "Alice is CEO" got merged with "Sarah is CTO" on cosine similarity alone, because the sentences share too much structural scaffolding. That's exactly the "facts in a vacuum" problem you're naming.
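For anyone who wants to see the scaffolding effect mechanically: here's a tiny self-contained sketch using character-trigram cosine as a crude stand-in for a sentence-embedding model (the `MERGE_THRESHOLD` is hypothetical, and YantrikDB's real consolidation logic is more involved). Most of the similarity between the two sentences comes from the shared " is ... of Acme." frame, not from the entities, so a similarity-only merge rule fires on two facts about different people.

```python
import math
from collections import Counter

def trigrams(s):
    # Character trigrams as a crude proxy for an embedding's
    # sensitivity to surface structure.
    return Counter(s[i:i + 3] for i in range(len(s) - 2))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = lambda c: math.sqrt(sum(v * v for v in c.values()))
    return dot / (norm(a) * norm(b))

alice = "Alice is CEO of Acme."
sarah = "Sarah is CTO of Acme."
other = "The deploy failed at 3am."

sim_pair = cosine(trigrams(alice), trigrams(sarah))  # high: shared scaffolding
sim_ctrl = cosine(trigrams(alice), trigrams(other))  # near zero: no overlap

MERGE_THRESHOLD = 0.5  # hypothetical consolidation threshold
would_merge = sim_pair > MERGE_THRESHOLD  # True: distinct facts get merged
```

The fix similarity-only systems usually need is an extra gate (entity match, or an LLM judge) before merging, precisely because surface cosine can't tell "same fact, reworded" from "same template, different subject".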

Two small clarifications:

remember(text, importance, domain) takes a free-form string — nothing forces atomic facts. A QMD-style prose block, a procedure, a dated plan, all work. The irony/sarcasm-inverts-the-fact failure mode lives in the agent's extraction layer, not the backend. So "write narrative into it, recall narrative out" is a legitimate usage pattern; the DB is agnostic.
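To make the "the DB is agnostic" point concrete, here's a minimal in-memory stand-in for a `remember(text, importance, domain)` backend. This is illustrative only (the `Memory` class and `recall` helper are my invention, not YantrikDB's client); the point is simply that an atomic fact and a QMD-style narrative block are both just strings to the store.

```python
import time
from dataclasses import dataclass, field

@dataclass
class Memory:
    text: str          # free-form: atomic fact, prose block, dated plan...
    importance: float
    domain: str
    created_at: float = field(default_factory=time.time)

class MemoryStore:
    """Toy stand-in for a remember(text, importance, domain) backend."""
    def __init__(self):
        self.memories = []

    def remember(self, text, importance, domain):
        self.memories.append(Memory(text, importance, domain))

    def recall(self, domain):
        # Naive recall: everything in the domain, most important first.
        return sorted((m for m in self.memories if m.domain == domain),
                      key=lambda m: -m.importance)

store = MemoryStore()
# A narrative block is just as valid as an atomic fact:
store.remember(
    "Week of May 6: planning a Paris trip, flights booked; mentioned "
    "possibly extending to visit a friend, so plans are still fluid.",
    importance=0.8, domain="travel")
store.remember("User prefers metric units.", importance=0.3, domain="prefs")
```

Whether what goes in is brittle or robust is decided entirely by the extraction layer upstream; the backend stores and ranks either shape the same way.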

YantrikDB's actual differentiator vs mem0 is temporal decay + consolidation + conflict detection, not smarter fact extraction. The "ABC has the Paris flight booked → actually I'm going to NYC" problem is meant to be addressed by decay (the old fact fades) and contradiction flagging (the new one triggers a conflict for the agent to resolve). But, to give an honest read, my bench today showed conflict detection needs work to actually fire on raw text. Filed as issues #1 and #2, fixing now.
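A sketch of how decay plus conflict flagging is supposed to handle the Paris → NYC case. Everything here is hypothetical (the 7-day half-life, the dict shapes, and the `detect_conflicts` hook are mine, not YantrikDB's internals), and note that this toy hook only triggers on same-domain overlap; real contradiction detection needs semantics, which is exactly the part my bench showed isn't firing yet.

```python
import math
import time

HALF_LIFE = 7 * 24 * 3600  # hypothetical 7-day half-life, in seconds

def decayed_score(importance, created_at, now=None):
    """Exponential decay: a memory's score halves every HALF_LIFE."""
    now = time.time() if now is None else now
    age = now - created_at
    return importance * math.exp(-math.log(2) * age / HALF_LIFE)

def detect_conflicts(new_text, existing, domain):
    # Toy hook: surface older same-domain memories for the agent to
    # resolve, instead of silently keeping both or overwriting.
    return [m for m in existing
            if m["domain"] == domain and m["text"] != new_text]

now = time.time()
old = {"text": "Flights booked for the Paris trip.",
       "domain": "travel", "importance": 1.0,
       "created_at": now - 14 * 24 * 3600}  # two weeks stale
new = {"text": "After Paris, continuing to NYC to visit a friend.",
       "domain": "travel", "importance": 1.0, "created_at": now}

# Decay alone already ranks the stale plan below the fresh one:
old_score = decayed_score(old["importance"], old["created_at"], now)  # 0.25
new_score = decayed_score(new["importance"], new["created_at"], now)  # 1.0

flagged = detect_conflicts(new["text"], [old], "travel")  # one conflict raised
```

Two half-lives of age drops the stale plan to a quarter of its original weight, so even before conflict resolution the fresh memory dominates recall ranking.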

Broader point stands though: if the agent is producing brittle inferred facts upstream, no memory backend saves it. The DB can manage rot and contradiction; it can't fix bad inference. For what it's worth, I mostly use it for durable role context ("user is a data scientist on observability") rather than event lifecycle ("Paris flight booked"). The latter is what prose summarization is genuinely better at, and I think you're right that mem0-style auto-extraction applied to lifecycle events is a bad shape.