samrus a day ago
Trying to follow invalid/impossible prompts by producing an invalid/impossible result and pretending it's all good is a regression. I would expect a confident coder to point out that the prompt/instruction was invalid. This test is valid; it highlights sycophantism.
bee_rider a day ago
I know “sycophantism” is a term of art in AI, and I’m sure it has diverged a bit from the English definition, but I still thought it had to do with flattering the user? In this case the desired response is defiance of the prompt, not rudeness to the user. The test is looking for helpful misalignment.