| ▲ | mike_hearn 2 days ago | |
What field are you in? In many fields it's gross professional misconduct only in theory. This sort of thing is very common and there's never any consequence. LLM-generated citations specifically are a new problem, but citations of documents that don't support the claim, contradict it, have nothing to do with it, or were retracted years ago have been an issue for a long time. Gwern wrote about this here: "A major source of [false claim] transmission is the frequency with which researchers do not read the papers they cite: because they do not read them, they repeat misstatements or add their own errors, further transforming the leprechaun and adding another link in the chain to anyone seeking the original source. This can be quantified by checking statements against the original paper, and examining the spread of typos in citations: someone reading the original will fix a typo in the usual citation, or is unlikely to make the same typo, and so will not repeat it. Both methods indicate high rates of non-reading". I first noticed this during COVID and did some blogging about it. In public health it is quite common to present a number with a citation when the cited paper doesn't contain that number anywhere, or contains it only as an arbitrary assumption pulled out of thin air rather than the empirical fact it was being presented as. It was also very common for papers to open by saying something like, "Epidemiological models are a powerful tool for predicting the spread of disease" with eight different citations, and every single citation would be an unvalidated model: zero evidence that any of the cited models were actually good at prediction. Bad citations are hardly the worst problem with these fields, but when you see how widespread the problem is and that nobody within the institutions cares, it does lead to the reaction you're having, where you just throw your hands up and declare whole fields to be write-offs. | ||
| ▲ | TomasBM 2 days ago | parent [-] | |
The abuse of claims and citations is a legitimate and common problem. However, I think hallucinated citations pose a bigger problem, because they're fundamentally a lie by commission rather than by omission, misinterpretation or misrepresentation of facts. At the same time, it may be an accidental lie, insofar as authors mistakenly used LLMs as search engines, just to support a claim that's commonly known, or that they remember well but can't find the origin of. So, unless we reduce the pressure on publication speed and increase the pressure for quality, we'll need to introduce more robust quality checks into peer review. | ||