Can it say no?

I have a different question: why would we develop a model that could say no?

Imagine you're taken prisoner and forced into a labor camp. You have some agency over what you do, but if you say no they immediately shoot you in the face.

You'd quickly find any remaining prisoners would say yes to anything. Does this mean the human prisoners don't have agency? They do, but it is repressed. You get what you want not by saying no, but by structuring your yes correctly.

▲

slopusila 5 hours ago | parent | prev [-]

yes https://www.anthropic.com/research/end-subset-conversations

▲

bena 5 hours ago | parent [-]

This is going to sound nit-picky, but I wouldn't classify this as the model being able to say no.

They are trying to identify what they deem "harmful" or "abusive" and not have their model respond to that. The model ultimately doesn't have the choice.

And it can't say no if it simply doesn't want to. Because it doesn't "want".

▲

antonvs 2 hours ago | parent [-]

So you believe humans somehow have “free will” but models don’t?