| ▲ | SonOfLilit 5 hours ago | |||||||
It's not about hacking capabilities, it's about misalignment. More like the golem myth (told it to fetch some water, drowned a city) then the gollum myth (used ring, ring hacked his brain, now he's a crazy violent meth addict). | ||||||||
| ▲ | furyofantares an hour ago | parent [-] | |||||||
I'm not sure I'd call it an alignment issue, because, in all cases I've seen where it does this (usually what I've seen is writing a python script to get around the harness permissions blocking something), it's trying to do the thing I just told it directly to do, and it's overcoming obstacles to accomplishing that. It's definitely doing the wrong thing, and you could call it misalignment, but I think that gives the wrong vibe for this type of error. | ||||||||
| ||||||||