groby_b 3 days ago
It's not "tired" to ask whether something is actually relevant in context. LLMs do not exist as marvels in themselves; their purpose is to offload human cognitive tasks. As such, it matters whether a failure mode is commonly shared between humans and LLMs, or LLM-specific. Ad absurdum: LLMs also show rapid increases in error rate if you replace more than half of the text with "Great Expectations". That says nothing about LLMs and everything about the study, and a human comparison would highlight that. No, this doesn't mean the paper should be ignored, but it does mean more rigor is necessary.