littlestymaar a day ago

> I personally can't identify anything that reads "act maliciously" or in a character that is malicious.

Because you haven't been trained on thousands of such story plots the way the model has in its training data.

It's the most stereotypical plot you can imagine; how can the AI not fall into the stereotype when you've just prompted it with that?

It's not like it analyzed the situation from a large, real context and concluded from the collected details that this was a valid strategy; instead, you're putting it in an artificial situation that has a massive bias in the training data.

It's as if you wrote “Hitler did nothing” to GPT-2 and were shocked because “wrong” is among the most likely next tokens. It wouldn't mean GPT-2 is a Nazi, it would just mean that the input matches too well with the training data.
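
For what it's worth, this kind of next-token ranking is easy to check yourself. A minimal sketch with the Hugging Face transformers library (treat the output as illustrative; the exact top tokens depend on the checkpoint and tokenizer versions):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the public GPT-2 checkpoint and its tokenizer.
    tokenizer = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    prompt = "Hitler did nothing"
    inputs = tokenizer(prompt, return_tensors="pt")

    with torch.no_grad():
        # logits shape: (batch, sequence_length, vocab_size)
        logits = model(**inputs).logits

    # Probability distribution over the token that would follow the prompt.
    next_token_probs = torch.softmax(logits[0, -1], dim=-1)
    top = torch.topk(next_token_probs, k=5)
    for prob, token_id in zip(top.values, top.indices):
        print(f"{tokenizer.decode(int(token_id))!r}: {prob.item():.3f}")

Whatever shows up in that top-5 list is just a statement about which continuations dominate the training data for that prefix, which is exactly the point.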

hoofedear a day ago | parent | next [-]

That's a very good point, like the premise does seem to play right into the stereotype of many stories/books/movies with a similar plot

whodatbo1 a day ago | parent | prev | next [-]

The issue here is that you can never be sure how the model will react to an input that seems ordinary. What if the most likely continuation is to exhibit malevolent intent or to construct a malicious plan, just because the input invokes some combination of obscure training data? This shows that models do have the ability to act out; it doesn't tell us under which conditions they reach such a state.

Spooky23 a day ago | parent | prev [-]

If this tech is empowered to make decisions, it needs to be prevented from drawing those conclusions, as we know how organic intelligence behaves when these conclusions get reached. Killing people you dislike is a simple concept that’s easy to train.

We need Asimov-style laws of robotics.

seanhunter a day ago | parent | prev | next [-]

That's true of all technology. We put a guard on chainsaws. We put robotic machining tools into a box so they don't accidentally kill the person who's operating them. I find it very strange that we're talking as though this is somehow meaningfully different.

Spooky23 12 hours ago | parent [-]

It’s different because you have a decision engine that is generally available. The blade guard protects the user from inattention… not the same as an autonomous chainsaw that mistakes my son for a tree.

Scaled up, technology like guided missiles is locked behind military classification. The technology to replicate many of the use cases of those weapons is now generally available, accessible to anyone with a credit card.

Discussions about security here often refer to Thompson’s “Reflections on Trusting Trust”. He was reflecting on compromising compilers; compilers have since moved up the stack and are replacing the programmer. As the required skill level of a “programmer” drops, you’re going to have to worry about more crazy scenarios.

eru a day ago | parent | prev [-]

> We need Asimov-style laws of robotics.

The laws are 'easy'; implementing them is hard.

chuckadams a day ago | parent [-]

Indeed, I, Robot is made up entirely of stories in which the Laws of Robotics break down, starting with a robot stuck in a mindless mechanical loop, oscillating between one law's priority and another's, and ending with a future where the robots paternalistically enslave all of humanity to keep it from coming to harm (sorry for the spoilers).

As for what Asimov thought of the wisdom of the laws, they were, as he put it, just hooks for telling "shaggy dog stories".