| ▲ | jacquesm 6 days ago |
| That's because there is no lock-in in the current AI ecosystems. Yet. But once AIs become your lifetime companion that knows everything there is to know about you, the lock-in will be maximized (imagine leaving your AI provider being something like a divorce in which you lose half your memory), and these parties will flock to it. The blessing right now is the limit on contextual memory. Once those limits fall away and all of your previous conversations are made part of the context I suspect the game will change considerably, as will the players. |
|
| ▲ | IceHegel 6 days ago | parent | next [-] |
| There's a chance this memory problem is not going to be that easy to solve. It's true that context lengths have gotten much longer, but not all context is created equal. There's a significant loss of model sharpness as context goes past 100K - sometimes earlier, sometimes later. Even using context windows to their maximum extent today, the models are not always especially nuanced over a long context. I compact after 100K tokens. |
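A rough sketch of what that compaction step might look like, with hypothetical count_tokens() and summarize() helpers standing in for whatever tokenizer and summarization model you actually use:

    def maybe_compact(messages, count_tokens, summarize, budget=100_000, keep_last=20):
        # Once the running conversation exceeds the token budget, fold the oldest
        # turns into a single summary message and keep the recent tail verbatim.
        total = sum(count_tokens(m["content"]) for m in messages)
        if total <= budget or len(messages) <= keep_last:
            return messages
        head, tail = messages[:-keep_last], messages[-keep_last:]
        summary = summarize("\n".join(m["content"] for m in head))
        return [{"role": "system", "content": "Summary of earlier turns: " + summary}] + tail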
| |
| ▲ | Ozzie_osman 6 days ago | parent | next [-] | | But you don't have to hold the entire memory in context. You just need to perfect the techniques for pulling in the parts of the context that you need: RAG, multi-agent architectures, and so on. It's not perfect yet, but it will get better over time. | |
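A minimal sketch of that kind of retrieval, assuming a hypothetical embed() function standing in for whichever embedding model you use:

    import numpy as np

    def recall(query, memory_chunks, embed, top_k=5):
        # Score every stored chunk against the current query and return only the
        # few most similar ones, instead of putting the whole history in context.
        q = np.asarray(embed(query), dtype=float)
        scored = []
        for chunk in memory_chunks:
            v = np.asarray(embed(chunk), dtype=float)
            sim = float(q @ v) / (np.linalg.norm(q) * np.linalg.norm(v))
            scored.append((sim, chunk))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [chunk for _, chunk in scored[:top_k]]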
| ▲ | elorant 6 days ago | parent | prev | next [-] | | In my experience, the context window by itself tells only half the story. Load a big document that’s 200k tokens and ask it a question, and it will answer just fine. Start a conversation that soon balloons past 100k, and it begins losing coherence pretty quickly. So I guess batch size plays a more significant role. | | |
| ▲ | IceHegel 4 days ago | parent [-] | | By batch size, do you mean the number of tokens in the context window that were generated by the model vs. external tokens? Because my understanding is that, however you get to 100K, the 100,001st token is generated the same way as far as the model is concerned. | | |
| |
| ▲ | luckydata 6 days ago | parent | prev | next [-] | | I'm oversimplifying here, but graph databases and knowledge graphs exist. An LLM doesn't need to preserve everything in context, just what it needs for that conversation. | | |
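A toy illustration of the idea, with made-up facts: memories stored as (subject, relation, object) triples, and only the ones touching an entity from the current conversation get pulled into context:

    def relevant_facts(graph, entities):
        # Keep only triples whose subject or object is already in play.
        return [t for t in graph if t[0] in entities or t[2] in entities]

    graph = [
        ("user", "works_at", "Acme"),
        ("user", "allergic_to", "peanuts"),
        ("user", "visited", "Lisbon"),
    ]
    print(relevant_facts(graph, {"peanuts"}))  # only the allergy fact goes into context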
| ▲ | IceHegel 4 days ago | parent [-] | | Unless there is a trick that I am missing, I don't think this will work by itself. The fundamental question is what the model can attend to as it generates the next token. If you give the model a summary plus a graph, it can still only attend to the summary when generating token 1. If it's going to call a tool to fetch a deeper memory, it still only has the summary at the moment it decides what to call. You get the same problem when asking the model to make changes in even medium-sized codebases: it starts from scratch each time, takes forever to read a bunch of files, and sometimes it reads the right stuff, other times it doesn't. |
| |
| ▲ | spiderfarmer 6 days ago | parent | prev [-] | | Context will need to go in layers. Like when you tell someone what you do for a living, your first version will be very broad. But when they ask the right questions, you can dive into details pretty quick. |
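As a toy sketch of that layering (the profile contents are made up for illustration): the broad summary is always in context, and a detail layer is only expanded when the conversation turns to that topic:

    profile = {
        "summary": "Backend developer, mostly payments infrastructure.",
        "details": {
            "payments": "Maintains the ledger service; led the Postgres migration.",
            "hobbies": "Plays club-level chess on weekends.",
        },
    }

    def context_for(topic=None):
        # Broad answer by default; dive into details only when the right question comes up.
        if topic in profile["details"]:
            return profile["summary"] + "\n" + profile["details"][topic]
        return profile["summary"]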
|
|
| ▲ | visarga 6 days ago | parent | prev | next [-] |
| Export your old chats and put them in a RAG system accessible from the new LLM provider. I did it: I made my chat history into an MCP tool I can use with Claude Desktop or Cursor. Ever since I started taking care of my LLM logs and memory, I've had no issues switching model providers. |
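For anyone wanting to try the same thing, a bare-bones sketch assuming the Python MCP SDK's FastMCP helper; the export file name and the naive substring search are just placeholders for your own logs and retrieval:

    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("chat-history")

    # Exported chats split into rough chunks; a real setup would use embeddings,
    # but the tool surface exposed to Claude Desktop or Cursor stays the same.
    LOGS = open("exported_chats.txt", encoding="utf-8").read().split("\n\n")

    @mcp.tool()
    def search_history(query: str, limit: int = 5) -> str:
        """Return past chat snippets containing the query string."""
        hits = [chunk for chunk in LOGS if query.lower() in chunk.lower()]
        return "\n---\n".join(hits[:limit]) or "no matches"

    if __name__ == "__main__":
        mcp.run()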
| |
| ▲ | luckydata 6 days ago | parent [-] | | Do you have some kind of tooling to automate the process? Would like to try it. |
|
|
| ▲ | sebastianz 6 days ago | parent | prev | next [-] |
| > But once AIs become your lifetime companion that knows everything there is to know about you, the lock-in will be maximized Why? It's just a bunch of text. They are forced by law to allow you to export your data - so you just take your life's "novel" and copy-paste it into their competition's robot. |
| |
| ▲ | Steve16384 6 days ago | parent [-] | | It's never quite that straightforward, or perceived as that straightforward. That's why most people just renew their insurance: it's easier than messing about with switching and worrying about whether the new provider will be any better. And how easy is it to transfer your email to another provider? |
|
|
| ▲ | zwnow 6 days ago | parent | prev | next [-] |
| Who even wants all of their previous conversations taken into account for everything they do? How do you grow when nothing is ever forgotten, mistakes included?
This is highly dystopian and I sure hope this will forever just be a fantasy. |
| |
| ▲ | visarga 6 days ago | parent [-] | | I turned 100MB of my own chat logs into a RAG memory and was surprised that I didn't like using it much. Why? It floods the LLM with so much prior thinking that it loses the creative spark. I now realize the sweet spot is in the middle - don't recall everything; use strategic disclosure to get the most out of the AI. LLM memory should be like a sexy dress - not too long, not too short. You get the most creative outputs when you hide part of your prior thinking and let the model infer it back. | | |
| ▲ | zwnow 6 days ago | parent [-] | | I am not an AI enthusiast, but I get what you're saying. I occasionally use ChatGPT, mostly because Google has become so enshittified. I often don't like the things it tells me, and I definitely don't like it complimenting everything I do, but that's something other people seem to like...
In my experience, starting a fresh chat after a while of back and forth can really help, so I agree with you. Sometimes a perspective with little to no prior context is exactly what you need. |
|
|
|
| ▲ | paool 6 days ago | parent | prev | next [-] |
| In order to get that lifetime companion, we'll need a leap in agentic memory. How do you know memory won't be modular and avoid lock-in? I can easily see a decentralized solution where the user owns the memory and AIs need permission to access your data, which can be revoked. |
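The permission side of that could be as simple as this toy sketch (not any real protocol): the store lives with the user, and a revoked grant simply makes a provider's reads fail:

    class UserOwnedMemory:
        def __init__(self):
            self._store = {}      # the memory itself, held by the user
            self._grants = set()  # providers currently allowed to read

        def grant(self, provider):
            self._grants.add(provider)

        def revoke(self, provider):
            self._grants.discard(provider)

        def write(self, key, value):
            # Only the user writes to their own store.
            self._store[key] = value

        def read(self, provider, key):
            if provider not in self._grants:
                raise PermissionError(provider + " has no access grant")
            return self._store.get(key)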
| |
| ▲ | randomNumber7 6 days ago | parent | next [-] | | I can easily see a world where users own the devices they buy and install the software they want, but the trend goes in the other direction. | |
| ▲ | ivape 6 days ago | parent | prev [-] | | > In order to get that lifetime companion, we'll need a leap in agentic memory. Well, let's take your life. Your life is about 3 billion seconds (for a 100-year life). That's just 3 billion next-tokens. The thing you do in second N is, taken as a whole, just a next token. If next-token prediction can be scaled up such that we redefine a token from a piece of language to an entire discrete event or action, then it won't be hard for the model to just know what you will think and do … next. Memory in that case is just the next possible recall of a specific memory, or the next possible action, and so on. The model doesn't actually need all the memory content, it just needs to know that you will seek a specific memory next. Why would it need your entire database of memories if it already knows which exact memory you will be looking for next? The only thing that could explode the computational cost of this is dynamic inputs that fuck with your next-token prediction. For example: you must now absolutely think about a Pink Elephant. But even that is constrained in our material world (still bounded, since the world can't physically transfer that much information through your senses). A human life up to this exact moment is just a series of tokens, believe it or not. We know it for a fact because we're bounded by time. The thing you just thought was an entire world snapshot that's no longer here, just like an LLM output. We have not yet trained a model on human lives, just on knowledge. We're not done with the bitter lesson. |
|
|
| ▲ | diffeomorphism 6 days ago | parent | prev | next [-] |
| Basic questions: what does a GDPR request get you? Wouldn't providers like you to switch to them? Just look at the smartphone market. |
|
| ▲ | lelanthran 6 days ago | parent | prev [-] |
| > Once those limits fall away and all of your previous conversations are made part of the context I suspect the game will change considerably, as will the players. I dunno if this is possible; sounds like an informally specified ad-hoc statement of the halting problem. |