whynotminot 14 hours ago

Gemini models also consistently hallucinate way more than OpenAI or Anthropic models in my experience.

Just an insane amount of YOLOing. Gemini models have gotten much better but they’re still not frontier in reliability in my experience.

usaar333 12 hours ago | parent | next [-]

True, but it gets you higher accuracy. Gemini had the best AA-Omniscience score:

https://artificialanalysis.ai/evaluations/omniscience

cubefox 13 hours ago | parent | prev [-]

In my experience, when I asked Gemini very niche knowledge questions, it did better than GPT-5.1 (I assume 5.2 is similar).

whynotminot 5 hours ago | parent [-]

Don’t get me wrong, Gemini 3 is very impressive! It just seems to always need to give you an answer, even if it has to make it up.

This was also largely how ChatGPT behaved before 5, but OpenAI has gotten much, much better at having the model admit it doesn’t know, or tell you that the thing you’re looking for doesn’t exist, instead of hallucinating something plausible-sounding.

Recent example: I was trying to fetch some specific data using an API, and after reading the API docs, I couldn’t figure out how to get it. I asked Gemini 3, since my company pays for that. Gemini gave me a plausible-sounding API call to make… which did not work and was completely made up.