4bpp 5 hours ago
If you are so allergic to using terms previously reserved for animal behaviour, you can instead unpack the definition and say that they produce outputs which make human and algorithmic observers conclude that they did not instantiate some undesirable pattern in other parts of their output, while actually instantiating those undesirable patterns. Does this seem any less problematic than deception to you?
surgical_fire 4 hours ago | parent
> Does this seem any less problematic than deception to you?

Yes. This sounds a lot more like a bug of sorts. So many times when using language models I have seen answers contradicting answers previously given.

The implication is simple: they have no memory. They operate on the tokens available at any given time, including their own previous output, and as earlier information gets drowned out, those contradictions pop up. No sane person should presume intent to deceive, because that's not how these systems operate.

By calling it "deception" you are actually ascribing intentionality to something incapable of it. This is marketing talk. "These systems are so intelligent they can try to deceive you" sounds a lot fancier than "Yeah, those systems have some odd bugs."
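To make that concrete, here is a minimal sketch of the kind of stateless loop being described: the prompt is rebuilt each turn from whatever recent history still fits in a fixed token budget, and anything older simply falls out of view. Every name here (MAX_CONTEXT_TOKENS, build_prompt, the model.generate interface) is hypothetical, not any particular vendor's API.

```python
# Sketch of a stateless chat loop with a bounded context window.
# Nothing persists between calls except the text that still fits in the
# window, so statements made early in a long conversation stop existing
# for the model and later answers can contradict them without any "intent".

MAX_CONTEXT_TOKENS = 1000  # illustrative limit; real models use far larger windows


def count_tokens(text: str) -> int:
    # crude stand-in for a real tokenizer
    return len(text.split())


def build_prompt(history: list[str], user_message: str) -> str:
    # walk backwards through the transcript, keeping only what fits the budget
    kept: list[str] = []
    budget = MAX_CONTEXT_TOKENS - count_tokens(user_message)
    for turn in reversed(history):
        cost = count_tokens(turn)
        if cost > budget:
            break  # older turns are silently dropped ("drowned out")
        kept.append(turn)
        budget -= cost
    return "\n".join(reversed(kept) + [user_message])


def chat(history: list[str], user_message: str, model) -> str:
    # the model only ever sees build_prompt's output; there is no other memory
    reply = model.generate(build_prompt(history, user_message))
    history.extend([user_message, reply])
    return reply
```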