Remix clone Hacker News

	▲	brap 8 hours ago
		Am I the only one who is starting to feel the Gemini Flash models are better than Pro? Flash is super fast, gets straight to the point. Pro takes ages to even respond, then starts yapping endlessly, usually confuses itself in the process and ends up with a wrong answer.
	▲	gnulinux 8 hours ago
		This is not my experience. In my experience Gemini 2.5 Pro is the best model in every use case I tried. There are a few very hard (graduate-level) logic or math problems where Claude 4.1 Opus edged out Gemini 2.5 Pro, but in general, if you have no idea which model will perform best on a difficult question, imho Gemini 2.5 Pro is the safer bet, especially since it's significantly cheaper. Gemini 2.5 Flash is really good but imho not nearly as good as Pro in (1) research math, (2) creative/artistic writing, and (3) open-ended programming debugging. On the other hand, I do prefer using Claude 4 Sonnet on very open-ended agentic programming tasks because it seems to have better integration with VSCode Copilot. Gemini 2.5 Pro bugs out much more often, whereas Claude works fine almost every time.
	▲	dvkramer 7 hours ago
		Yeah, that's how I feel too. Flash is less verbose, and every LLM nowadays seems to be designed by some low-taste people who reward the model for falsely hedging (e.g. "The 2024 Corolla Cross usually has an X gallon gas tank") on stuff that isn't at all variable or questionable. This false hedging is way more of an issue than hallucinations in my experience, and the "smarter" 2.5 Pro is not any better at avoiding it than Flash. Also, 2.5 Pro is often incapable of searching and will hallucinate instead. I don't know why. It will claim it searched and then return some made-up results. 2.5 Flash is much more consistently capable of searching.
	▲	selimthegrim 8 hours ago
		I tried to put Pro deep research on an actual research task and it didn’t even return anything; it just kept on working.