peacebeard 6 hours ago

The code has a stated goal of avoiding leaks, but the actual implementation is broader than that. I see two possible explanations:

* The authors made the code very broad to improve its ability to achieve the stated goal
* The authors have an unstated goal

I think it's healthy to be skeptical, but what I'm seeing is that the skeptics are pushing beyond what's actually in the source. For example, you say it "says on the tin" that it "pretends to be human," but it simply does not say that on the tin. It does say "Write commit messages as a human developer would," which is not the same thing as "Try to trick people into believing you're human." To convince people of your skepticism, it's best to stick to the facts.
mzajc 5 hours ago

By "says on the tin," I was referring to the name ("undercover mode") and the instruction to "not blow your cover." If pretending to be a human is not the cover here, what is? Additionally, does Claude Code still admit that it's an LLM when this prompt is active, as you suggest, or does it pretend to be a human as the prompt tells it to?
cat_plus_plus 4 hours ago

Why are you assuming the actual implementation was authored by a human?