dvkramer 7 hours ago

Yeah, that's how I feel too. Flash is less verbose, and every LLM nowadays seems to be designed by low-taste people who reward the model for falsely hedging (e.g. "The 2024 Corolla Cross usually has an X gallon gas tank") on stuff that isn't at all variable or questionable. This false hedging is way more of an issue than hallucinations in my experience, and the "smarter" 2.5 Pro is no better at avoiding it than Flash.

Also, 2.5 Pro is often incapable of searching and will hallucinate instead. I don't know why. It will claim it searched and then return made-up results. 2.5 Flash is much more consistently capable of searching.