benashford 4 hours ago
Intuitively this feels obvious. Content generated by the model will be shaped by its training, so when reading it back it will resonate with that same training and be viewed positively as a result.

Human, when preparing a CV: "Make my CV more professional."

LLM, many days later, presenting a report to HR: "This CV is really professional."

There's probably more to it than that, of course. But it justifies my personal policy of using a different LLM family for code review tasks than for code generation tasks, to avoid the "marking your own homework" problem.
gzread 3 hours ago | parent
And not in human-interpretable ways. There was a study where an LLM was told to behave in a certain way and then asked to output random numbers. When those numbers were pasted into another LLM instance, it also behaved that way. I wish I remembered more about that study or had a link to it - it was fascinating.