| ▲ | joecarpenter 5 days ago | |
Isn't it the opposite? From the link: Scores range from -100 to 100, where 0 means as many correct as incorrect answers, and negative scores mean more incorrect than correct. Gemini 3 Flash scored +13 in the test, meaning it gave more correct answers than incorrect ones. | ||
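To make the quoted scale concrete, here is a rough sketch of one formula that fits that description (range -100 to 100, 0 when correct and incorrect answers are equal). The function name, the treatment of abstentions, and the example counts are my assumptions for illustration, not the benchmark's published definition or data.

```python
def signed_accuracy_score(correct: int, incorrect: int, abstained: int = 0) -> float:
    """Hypothetical score on a -100..100 scale.

    Assumption: each correct answer counts +1, each incorrect answer -1,
    and abstentions count 0, normalized by the total number of questions.
    """
    total = correct + incorrect + abstained
    if total == 0:
        raise ValueError("at least one question is required")
    return 100.0 * (correct - incorrect) / total


# Made-up counts chosen only to land on +13; not real benchmark results.
print(signed_accuracy_score(correct=400, incorrect=270, abstained=330))  # 13.0
```

Under this reading, a positive score only says that correct answers outnumber incorrect ones; it says nothing on its own about how often the model answers wrongly instead of abstaining.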
| ▲ | sabareesh 5 days ago | parent | next [-] | |
Nope, lower is better. Compared to recent OpenAI models, this is bad. I am looking at the AA-Omniscience Hallucination Rate. | ||
| ▲ | nemonemo 5 days ago | parent | prev [-] | |
One thing I don't understand is why Gemini Pro appears much cheaper than Gemini Flash in the scatter plot. | ||