kostaj 2 hours ago
Two of the five models used (Gemini+Search and Sonar Pro) have retrieval capabilities and used search when classifying the claims. Even so, the disagreement between them is still substantial: 42%.
simonw 2 hours ago
Here are those disagreements: https://lite.datasette.io/?csv=https%3A%2F%2Fstatic.simonwil...

One example:

Researchers estimate that the average person ingests about 5 grams of plastic per week, which is approximately the weight of a credit card.

Gemini retrieval: Misleading
Sonar Pro: Mostly True
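
For anyone wanting to reproduce a pairwise figure like the 42% quoted above, here is a minimal sketch of the arithmetic, assuming each model's verdicts are stored in a dict keyed by claim ID. The function name `disagreement_rate`, the claim IDs, and every verdict other than the plastic example are hypothetical placeholders, not values from the linked dataset.

```python
def disagreement_rate(labels_a, labels_b):
    """Return (rate, differing_claims) over the claims both models labelled."""
    shared = sorted(set(labels_a) & set(labels_b))
    if not shared:
        raise ValueError("the two models have no claims in common")
    differing = [claim for claim in shared if labels_a[claim] != labels_b[claim]]
    return len(differing) / len(shared), differing


# Hypothetical sample data: only the "plastic" verdicts come from the thread.
gemini_retrieval = {
    "plastic": "Misleading",
    "claim-2": "True",
    "claim-3": "False",
}
sonar_pro = {
    "plastic": "Mostly True",
    "claim-2": "True",
    "claim-3": "False",
}

rate, differing = disagreement_rate(gemini_retrieval, sonar_pro)
print(f"{rate:.0%} disagreement; differing claims: {differing}")
```

This counts any difference in label as a disagreement, so "Misleading" vs "Mostly True" weighs the same as "True" vs "False". A distance-weighted measure over an ordered label scale would be a different design choice.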