kgeist a day ago

>The model uses this internal error signal (the gradient) as a mathematical equivalent of saying, "This is unexpected and important!" This allows the Titans architecture to selectively update its long-term memory only with the most novel and context-breaking information

So one can break a model by consistently feeding it random, highly improbable junk? Everything would be registered as a surprise and get stored, impacting future interactions.
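A minimal toy sketch of the mechanism the quoted gloss describes (this is not the Titans code; the names and the plain gradient-based "surprise" are assumptions). With nothing learned on top of the raw gradient signal, random junk does produce large gradients and therefore large writes, which is the worry here:

    import torch

    def surprise_write(memory, key, value, lr=0.1):
        # "surprise" = gradient of an associative-recall loss on the memory
        memory = memory.detach().requires_grad_(True)
        pred = key @ memory                        # what the memory currently recalls
        loss = ((pred - value) ** 2).mean()        # how wrong that recall is
        grad, = torch.autograd.grad(loss, memory)  # "this is unexpected and important!"
        return (memory - lr * grad).detach(), grad.norm().item()

    d = 64
    memory = torch.zeros(d, d)                              # toy long-term memory
    junk_key, junk_value = torch.randn(d), torch.randn(d)   # random, improbable junk
    memory, surprise = surprise_write(memory, junk_key, junk_value)
    print(f"surprise for random junk: {surprise:.3f}")      # large, so it gets stored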

andy12_ a day ago | parent | next [-]

This is an oversimplification of what Titans does. The model performs nested learning, where the model learns during inference, and during training the model weights learn _how and what_ to learn during inference. If the input contains junk or irrelevant information, the model most likely learned during training to assign low-surprise query and key embeddings to those tokens, because learning those junk tokens would have hurt its overall ability to predict subsequent tokens (and thus would have increased the training loss).
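A rough sketch of that nested setup (assumptions: the module names, the sigmoid gate, and a single inner gradient step; the actual Titans update also involves momentum and a forgetting gate). The inner loop still updates the memory from a gradient "surprise" signal, but the write strength comes from outer-loop parameters trained end-to-end, so tokens whose memorization would have hurt next-token prediction end up with a near-zero gate:

    import torch
    import torch.nn as nn

    class GatedMemoryWriter(nn.Module):
        def __init__(self, d_model=64, d_mem=64):
            super().__init__()
            self.W_K = nn.Linear(d_model, d_mem, bias=False)  # outer-loop (trained) params
            self.W_V = nn.Linear(d_model, d_mem, bias=False)
            self.gate = nn.Linear(d_model, 1)                 # learned write strength

        def forward(self, memory, x):
            # one inner-loop step: gradient update on `memory`, scaled by a learned gate
            memory = memory.detach().requires_grad_(True)
            k, v = self.W_K(x), self.W_V(x)
            loss = ((k @ memory - v) ** 2).mean()             # associative-recall loss
            grad, = torch.autograd.grad(loss, memory, create_graph=self.training)
            theta = torch.sigmoid(self.gate(x))               # ~0 for tokens not worth storing
            return memory - theta * grad                      # gated "surprise" write

During training the gate is optimized against the ordinary next-token loss, which is where "learning junk would have increased the training loss" pushes it toward zero for junk tokens.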

pmichaud a day ago | parent | prev | next [-]

I’m guessing that this is the first thing they thought of and the problem only exists in the superficial gloss you’re responding to?

bethekidyouwant a day ago | parent | prev | next [-]

In what world can you not always break the response of an AI by feeding it a bunch of random junk?

xnx a day ago | parent | next [-]

Indeed. In what world can you not break any tool when deliberately misusing it?

lacoolj 6 hours ago | parent [-]

BRB getting an anvil

kgeist a day ago | parent | prev | next [-]

I mean, currently LLMs are stateless and you can get rid of all the poisoned data by just starting a new conversation (context). The OP introduces "long-term memory", where junk will accumulate over time.

soerxpso a day ago | parent | next [-]

I believe you're misunderstanding what the OP means by "long-term" memory. From what I can tell, it's not actively modifying the weights of the underlying model; it just "remembers" things from far back in its context. The point is that this allows it to remember something it read ~200 pages ago in a very long context window, not that it can remember something from one session into another clean session.

AlexCoventry a day ago | parent [-]

This model has fast weights, which actually are modified during inference.
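Loosely, a toy sketch of that fast/slow split (the names and the Hebbian-style outer-product write are assumptions, not the paper's exact rule): the slow weights stay frozen at inference, only the fast state is written on each token, and resetting it wipes whatever the session stored.

    import torch

    slow = torch.randn(64, 64)      # backbone weights, frozen after training
    fast = torch.zeros(64, 64)      # per-session state, updated during inference

    def step(x, fast, write_lr=0.05):
        h = x @ slow + x @ fast                     # read from both slow and fast weights
        fast = fast + write_lr * torch.outer(x, h)  # write into the fast state only
        return h, fast

    for token_embedding in torch.randn(10, 64):     # toy token stream
        _, fast = step(token_embedding, fast)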

energy123 a day ago | parent [-]

Marketplace for fast weights inbound

dmix a day ago | parent | prev [-]

In something like Cursor, if it messes something up you can click 'undo'. I'd imagine a small snapshot would only be persisted to memory if you keep its output, and even then it's mostly just a summary.

There are probably lots of small signals of "the user is happy with the output", and the longer the history, the more it will converge on what you want, including when the user says "don't do [x]", which overrides past stuff.

CooCooCaCha a day ago | parent | prev [-]

I mean ideally AI would be resilient to junk, don't you think?

vlovich123 a day ago | parent | next [-]

Humans are pretty vulnerable to junk so I’m not sure.

amarant a day ago | parent | prev [-]

Ideally, you'd run your own instance of this, I think.

I can see a product where you purchase a model that has basic training, and then, using the features outlined in the paper, it learns on the fly from your usage.

I can also see there being a secondary market for specially trained models, with long-term memory filled with some specific skill, done in some specific way. To make a silly example, imagine buying a licence to Torvalds's OS coding assistant, ready to insult your PRs before you even commit them! (And possibly help you write code in Torvalds's style too.)

This would of course require Linus to use the model enough for it to learn. I won't comment on the likelihood of that happening; it's just a silly example, after all.

idiotsecant a day ago | parent | prev | next [-]

This is the start of what I always thought an AI should have: a limbic system. Humans don't store memory based on novelty; they store it based on emotional content. This is where I was afraid of the tiger, this is where I smelled delicious food, this was what it felt like when I was victorious in the hunt.

AI needs an internal emotional state because that's what drives attention and memory. AI needs to want something.

luckydata a day ago | parent | next [-]

That would be the biggest mistake anyone could make. I hope nobody goes down this route. An AI "wanting" things is an enormous risk to alignment.

pixl97 a day ago | parent | next [-]

I mean, setting any neural net up with a 'goal' is really just defining a want/need. You can't just encode the entire problem space of reality; you have to give the application something to filter out.

idiotsecant a day ago | parent | prev [-]

At some point I think we'll have to face the idea that any AI more intelligent than ourselves will by definition be able to evade our alignment tricks.

luckydata a day ago | parent [-]

Equating "more intelligent" with "wanting things" is a fallacy. You can have a hyper-intelligent computer that simply waits for you to ask it to do a job, or you can endow it with the digital equivalent of hunger and reproductive instincts and it will behave completely differently.

We would be INSANE to pursue giving that type of instincts to AIs.

drdeca 18 hours ago | parent | next [-]

For some senses of “wanting things”, I think it might be hard to make a powerful AI that couldn’t be easily modified to produce one that “wants things” in some sense.

So, if it would be a bad thing for one to be made that "wants things" in any reasonable sense of the phrase, then it would probably be bad for J Random to be able to take a copy of a powerful AI and modify it in some way, because someone is likely to try doing that.

Of course, perhaps the best way to make sure that J Random doesn’t have the ability to do that, is to make sure no one does.

sayamqazi 14 hours ago | parent | prev [-]

You are making the claim that "intelligence" is separable from the other things found in humans and other animals. There is no proof or example supporting this.

I have come to believe that we will only be able to truly replicate intelligence if the system is trying to preserve itself. It's the biggest incentive ever to do intelligent things.

red75prime 11 hours ago | parent | prev [-]

...this is where I randomly decided to remember this particular day of my life. Yep, I indeed did it because why not. No, it didn't work particularly well, but I do remember some things about that day.

I mean, it's not just an automatic thing with no higher-level control.

falcor84 10 hours ago | parent | prev | next [-]

I read that this works on humans too. Minds can break.

photochemsyn a day ago | parent | prev [-]

This is no different from what happens to humans if they're locked into cult-programming situations: they'll start believing and regurgitating all kinds of nonsense if their information stream is tightly curated.

Practically, for use with a codebase development effort, if the model remembers the original design decisions and the discussions about costs and benefits, and can recall all of that much later in the process, it's going to start getting really good at thinking about what the next step is, or even at deciding when a major refactor is needed, etc.