pjdesno 35 minutes ago
They overstate their results in the headline. In section 2, 34% of cases are found to have "substantive" disagreements, i.e. ratings that differ by 2 or more buckets: True + Misleading, Mostly True + False, or True + False. This is probably a better measure than the headline one. It's still a concerning fraction, although some of it is no doubt due to forcing "I don't know" cases to return an answer anyway.