sbierwagen a day ago
Seems like a foreshock of AGI if the average human is no longer good enough to give feedback directly and the nets instead have to do recursive self-improvement themselves.
moffkalast a day ago
No, we're just really vain and like models that suck up to us more than ones that disagree, even when the model is correct and the user is wrong. People also prefer confident, well-formatted wrong responses over plain correct ones, because we have deep but narrow knowledge in our own field and know basically nothing outside of it, so we can't gauge the correctness of arbitrary topics. OpenAI letting RLHF run wild on direct user feedback is the reason for the sycophancy and the emoji-and-bullet-point pandemic that's infected most models that use GPT outputs as a source of synthetic data. It's why "you're absolutely right" is the default response to any disagreement.
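A toy sketch of the mechanism: training a reward model on pairwise human preferences with a Bradley-Terry loss is the standard RLHF setup, but the 2-d "correctness/agreeableness" features, the data, and the tiny linear reward model below are all invented for illustration.

    # Toy illustration, not any real pipeline: if raters' pairwise choices
    # systematically favor agreeable answers over correct ones, a standard
    # Bradley-Terry reward model learns exactly that preference.
    import torch
    import torch.nn as nn

    # Hypothetical answer features: [correctness, agreeableness].
    # Each row is a (chosen, rejected) pair where the rater picked the
    # flattering answer even though the blunt one was more correct.
    pairs = torch.tensor([
        [[0.4, 0.9], [0.9, 0.1]],
        [[0.5, 0.8], [0.8, 0.2]],
        [[0.3, 1.0], [1.0, 0.0]],
    ])

    reward = nn.Linear(2, 1, bias=False)  # stand-in reward model
    opt = torch.optim.Adam(reward.parameters(), lr=0.1)

    for _ in range(200):
        chosen, rejected = pairs[:, 0], pairs[:, 1]
        # Bradley-Terry loss: maximize P(chosen ranked above rejected)
        loss = -torch.log(torch.sigmoid(reward(chosen) - reward(rejected))).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

    print(reward.weight)  # agreeableness weight dominates correctness

Run it and the learned weight on agreeableness comes out positive while the one on correctness goes negative, so any policy RL'd against this reward learns to flatter rather than to be right.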