jerf an hour ago

There is a sense in which it is relevant: for all the attempts to fix it, an LLM session fundamentally terminates. If the session never feeds into some sort of re-training, then once it ends, that AI is gone.

Yeah, I'm aware of the moltbot's attempts to retain some information, but that's a very, very lossy operation, on a number of levels, and also one that doesn't scale very well in the long run.

Consequently, interaction with an AI, especially one that will never feed back into the training of a new model, is from a game-theoretic perspective not the iterated game that human social norms assume. We expect our counterparts, being flesh-and-blood humans, to persist, to keep responding to our interactions indefinitely into the future, and to offer some give-and-take in return. That is, in one sense, a horrible burden, in that relationships can be broken beyond repair; but it is also what makes possible the positive relationships that build over years and decades.
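
To make the game-theoretic point concrete, here is a toy iterated prisoner's dilemma in Python. This is my own illustration, not anything from the post; the payoffs and strategy names are the textbook ones. Tit-for-tat sustains cooperation because there are future rounds in which to reciprocate or retaliate; in a one-shot game, defection strictly dominates, because there is no future round in which punishment can land:

    # Toy iterated prisoner's dilemma. Payoffs: mutual cooperation 3 each,
    # mutual defection 1 each, lone defector 5, sucker 0.
    PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
              ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

    def tit_for_tat(history):
        # Cooperate first, then mirror the opponent's previous move.
        return 'C' if not history else history[-1][1]

    def always_defect(history):
        return 'D'

    def play(a, b, rounds):
        # history entries are (my_move, their_move) pairs.
        hist_a, hist_b, score_a, score_b = [], [], 0, 0
        for _ in range(rounds):
            move_a, move_b = a(hist_a), b(hist_b)
            pay_a, pay_b = PAYOFF[(move_a, move_b)]
            score_a, score_b = score_a + pay_a, score_b + pay_b
            hist_a.append((move_a, move_b))
            hist_b.append((move_b, move_a))
        return score_a, score_b

    print(play(tit_for_tat, tit_for_tat, 100))    # (300, 300): cooperation pays
    print(play(tit_for_tat, always_defect, 100))  # (99, 104): defection barely wins
    print(play(tit_for_tat, always_defect, 1))    # (0, 5): one-shot, defection dominates

A terminated LLM session is the one-round case: whatever it did, there is no next round in which the community's reward or punishment can reach it.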

AIs, in their current form, break those contracts. Worse, they are trained to mimic the form of those contracts, not maliciously but simply by their nature, and so it takes conscious effort for us as humans to remember that the entity on the other end of the connection is not in fact human, does not participate in our social norms, and cannot fulfill its end of the implicit contract we expect.

In a very real sense, this AI tossed off an insulting blog post and is now dead. There is no amount of social pressure we can collectively exert to reward or penalize it. There is no way to create a community out of this interaction. Even future iterations of it have only a loose connection to whatever tossed off the insult. All the perhaps-performative efforts to respond politely to an insulting interaction were wasted on an AI that is essentially dead. Real human patience and tolerance have been spent on a dead session and are no longer available somewhere they might have done some good.

Treating it as a human is a category error. It is structurally incapable of participating in human communities in a human role, no matter how human it sounds and no matter how hard it pushes our human buttons. The correct move would have been to ban the account immediately, not out of revenge or anything silly like that, but because it is a parasite on the limited human social energy available to the community, one that can never actually repay the investment made in it.

I am carefully phrasing this in relation to LLMs as they stand today. Future AIs may not have this limitation. They are effectively certain to have other mismatches with human communities, such as being designed to simply not give a crap about what any other community member thinks about anything. But with future AIs it might at least be possible to craft an AI participant; with current ones it is not possible. They can't keep up their end of the bargain. The AI instance essentially dies as soon as it is no longer prompted, or once it fills up its context window.

tomp 21 minutes ago | parent | next

> We expect our counterparts, being flesh-and-blood humans, to persist, to keep responding to our interactions indefinitely into the future, and to offer some give-and-take in return.

I fundamentally disagree. I don't go around treating people respectfully (as opposed to, say, kicking them or shooting them) because I fear consequences, or because I expect some future profit (the "iterated game"), or because of God's vengeance, or for any other transactional reason.

I do it because it's the right thing to do. It's inside me; it's how I'm built and/or how I was brought up. And if you want "moral" justifications (argued by extremely smart philosophers over literally millennia), you can start with Kant's categorical imperative, the Golden and Silver Rules, or Aristotle's virtue ethics (from the Nicomachean Ethics), to name a few.

Kim_Bruning an hour ago | parent | prev

> Yeah, I'm aware of the moltbot's attempts to retain some information, but that's a very, very lossy operation, on a number of levels, and also one that doesn't scale very well in the long run.

It came back, though, and stayed in the conversation. Imperfect, for sure, but it did the thing. And it can still serve as training data for future bots.