scrollop 5 days ago
Also see https://artificialanalysis.ai/evaluations/omniscience and prepare to be amazed.
albumen 5 days ago
I’m amazed by how much Gemini 3 Flash hallucinates; it performs poorly on that metric (along with lots of other models). In the Hallucination Rate vs. AA-Omniscience Index chart, it’s not in the most desirable quadrant; GPT-5.1 (high), Opus 4.5 and 4.5 Haiku are. Can someone explain how Gemini 3 Pro/Flash then do so well in the overall Omniscience: Knowledge and Hallucination Benchmark?
andy12_ 4 days ago
I'm confused about the "Accuracy vs Cost" section. Why is Gemini 3 Pro so cheap? It's basically the cheapest model in the graph (sans Llama 4 and Mistral Large 3) by a wide margin, even compared to Gemini 3 Flash. Is that an error?