puppycodes 6 hours ago

So... this would be fine with them?

Claude: "Are you sure you want me to commit murder?"

User: "Yes"

Or do you mean the human presses a button:

Claude: "Do you want to commit murder? If so, press the button."

User: "I pressed the button"

Claude: "Great! Now let's summarize what we did."

xvector 6 hours ago | parent [-]

First one

puppycodes 6 hours ago | parent [-]

Seems like an absurd distinction to me... Reminds me of "I was just following orders"...

xvector 5 hours ago | parent [-]

I mean the distinction doesn't really matter

There are many ways to construct human-in-the-loop (HITL) UXes, but typically they'd take the form of the first one.

I think you're missing the forest for the trees. All Anthropic is saying is that HITL is required before murder; the UX is irrelevant.

puppycodes 4 hours ago | parent | next [-]

I agree the distinction doesn't matter, but I'm not so sure "just" having a human in the loop qualifies as an ethical stand. Just because you're not pulling the trigger doesn't mean you're not culpable for the outcome.
