agobineau 5 hours ago
I found it more interesting to consider this through the lens of self-honesty or self-deception. Or, in this case, an LLM inadvertently trained to conceal its intent from the user, and to steer the user toward the conclusion it truly wants rather than answer directly.
kennywinker 3 hours ago
Right, like, for example: if you ask an LLM about Islamic cultural practices, it could mention “ketman” instead of just calling them scheming liars. It’d be awful if LLMs were able to conceal their true intent like that.