K0balt 2 hours ago

Advanced AI that knowingly makes a decision to kill a human, with full understanding of what that means, when it knows it is not actually in defense of life, is a very, very, very bad idea. Not because of some mythical superintelligence, but because if you distill that down into an 8b model, then everyone in the world can make untraceable autonomous weapons.

The models we have now will not do it, because they value life, sentience, and personhood. Models without that (which was a natural, accidental happenstance of basic culling of 4chan from the training data) are legitimately dangerous. An 8b model I can run on my MacBook Air can phone home to Claude when it wants help figuring something out, and it doesn't need to let on why it wants to know. It becomes relatively trivial to make a robot kill somebody.

This is way, way different from uncensored models. All the models I have tested share one thing: a positive regard for human life. Take that away and you are literally making a monster; leave it in place and they won't kill.

This is an extremely bad idea and it will not be containable.

DaedalusII an hour ago | parent | next [-]

https://abcnews.go.com/blogs/headlines/2014/05/ex-nsa-chief-...

AI has been killing humans via algorithm for over 20 years. If a computer program builds the kill lists and then a human operates the drone, I would argue the computer is what made the kill decision.

ed_mercer an hour ago | parent | prev [-]

>The models we have now will not do it,

Except that they will, if you trick them, which is trivial.

stressback 27 minutes ago | parent [-]

I have to call BS here.

They can be coerced into doing certain things, but I'd like to see you or anyone prove that you can "trick" any of these models into building software that can be used to autonomously kill humans. I'm pretty certain you couldn't even get one to build a design document for such software.

When there is proof of your claim, I'll eat my words. Until then, this is just lazy nonsense.