Foreignborn 3 days ago
do you have a source? when i’ve done toy demos where GPT-5, Sonnet 4 and Gemini 2.5 Pro critique/vote on various docs (e.g. PRDs), they did not choose their own material more often than not. my setup wasn’t intended as a benchmark though, so i could be wrong over enough iterations.
gwern 3 days ago
I don't have any particularly canonical reference I'd cite here, but self-preference bias in LLMs is well-established. (Just search arXiv.)