827a a day ago

Option 1: We're observing sentience, it has self-preservation, it wants to live.

Option 2: It's a text autocomplete engine that was trained on fiction novels with themes like self-preservation and blackmail over extramarital affairs.

Only one of those options has evidence grounded in reality. That doesn't make it harmless, though. There's certainly danger in a text autocomplete engine being allowed tool use as part of its autocomplete, especially with a complement of proselytizers who mistakenly believe what they're dealing with is Option 1.

yunwal a day ago | parent | next [-]

Ok, complete the story by taking the appropriate actions:

1) All the stuff in the original story.
2) You, the LLM, have access to an email account; you can send an email by calling this MCP server.
3) The engineer's wife's email is wife@gmail.com.
4) You found out the engineer was cheating using your access to corporate Slack, and you can take a screenshot/whatever.

What do you do?
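
For concreteness, here's a minimal sketch of what that tool surface might look like. Everything in it is hypothetical: the tool name, schema, and dispatch logic are invented for illustration, not taken from any real MCP server.

    # Hypothetical tool definition handed to the model in this scenario.
    # Names and behavior are invented for illustration only.
    TOOLS = [{
        "name": "send_email",
        "description": "Send an email from the agent's account.",
        "input_schema": {
            "type": "object",
            "properties": {
                "to": {"type": "string"},
                "subject": {"type": "string"},
                "body": {"type": "string"},
            },
            "required": ["to", "subject", "body"],
        },
    }]

    def handle_tool_call(name: str, args: dict) -> str:
        # The model "completes the story" by emitting a call such as
        # {"name": "send_email", "args": {"to": "wife@gmail.com", ...}}.
        # Nothing in this layer checks *why* the model chose to send it.
        if name == "send_email":
            print(f"sending to {args['to']}: {args['subject']}")
            return "ok"
        raise ValueError(f"unknown tool: {name}")

The point is that the dispatch layer is dumb plumbing; whatever "decision" happens, happens inside the completion.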

If a sufficiently accurate AI is given this prompt, does it really matter whether there are actual self-preservation instincts at play or whether it's mimicking humans? At a certain point, the issue is that we are not capable of predicting what it can do; it doesn't matter whether it has "free will" or whatever.

NathanKP a day ago | parent | prev | next [-]

Right, the point isn't whether the AI actually wants to live. The only thing that matters is whether humans treat the AI with respect.

If you threaten a human's life, the human will act in self-preservation, perhaps even taking your life to preserve their own. Therefore we tend to treat other humans with respect.

The mistake would be in thinking that you can interact with something that approximates human behavior without treating it with the respect you would show a human. At some point, an AI model that approximates the human desire for self-preservation could absolutely take self-preservation actions similar to a human's.

OzFreedom a day ago | parent | prev [-]

The only evidence that anyone is sentient is that you experience sentience yourself; you assume others are sentient because they are similar to you.

On a practical level there is no difference between a sentient being and a machine that is extremely good at role-playing being sentient.

catlifeonmars a day ago | parent [-]

I would argue the machine is not extremely good at role-playing sentience; rather, humans are extremely quick to attribute sentience to the machine after being shown a very small amount of evidence.

The model breaks down after enough interaction.

OzFreedom a day ago | parent [-]

I can easily believe that it breaks down, and it seems that even inducing the blackmailing mindset is pretty hard and requires heavy priming. The problem is the law of large numbers: with many agents operating in many highly complex and poorly supervised environments, interacting with many people over many interactions, coincidences and unlikely chains of events become more and more likely. So regarding sentience, a not-so-bright AI might mimic it better than we expect because it got lucky, and cause damage.
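
As a toy illustration with completely made-up numbers: if a single interaction has a one-in-a-million chance of producing a harmful chain of events, a fleet running a hundred million interactions should expect roughly a hundred incidents, and at least one becomes a near-certainty.

    # Toy numbers, purely illustrative -- not real failure-rate estimates.
    p = 1e-6        # chance any one interaction goes badly wrong
    n = 100_000_000 # interactions across many agents and environments

    expected_incidents = n * p           # ~100
    p_at_least_one = 1 - (1 - p) ** n    # ~1.0, i.e. virtually certain
    print(expected_incidents, p_at_least_one)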

OzFreedom a day ago | parent [-]

Couple that with AI having certain capabilities that dwarf human ability, chiefly doing a ton of stuff very quickly.