| ▲ | newfocogi 9 hours ago |
| Non-AI Summary: Both models have improved intelligence on the Artificial Analysis index with lower end-to-end response time, plus 24% to 50% improved output token efficiency (resulting in lower cost). Gemini 2.5 Flash-Lite improvements include better instruction following, reduced verbosity, and stronger multimodal & translation capabilities. Gemini 2.5 Flash improvements include better agentic tool use and more token-efficient reasoning. Model strings: gemini-2.5-flash-lite-preview-09-2025 and gemini-2.5-flash-preview-09-2025 |
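A minimal sketch of how the new preview model strings above might be selected in a request (assuming the public `generateContent` REST payload shape; nothing here is sent over the network, and the prompt is just a placeholder):

```python
# Build a generateContent request targeting one of the new preview
# model strings. Only constructs the URL and body; no network call.
PREVIEW_MODELS = [
    "gemini-2.5-flash-lite-preview-09-2025",
    "gemini-2.5-flash-preview-09-2025",
]

def build_request(model: str, prompt: str) -> tuple[str, dict]:
    """Return (url, json_body) for a generateContent call."""
    url = (
        "https://generativelanguage.googleapis.com/v1beta/"
        f"models/{model}:generateContent"
    )
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return url, body

url, body = build_request(PREVIEW_MODELS[1], "Summarize this thread.")
```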
|
| ▲ | jonplackett 8 hours ago | parent | next [-] |
| I think “Non-AI summary” is going to become a thing. I already enjoyed reading it more because I knew someone had thought about the content. |
| |
| ▲ | paxys 5 hours ago | parent [-] | | As soon as it becomes a thing LLMs will start putting "Non-AI summary" at the top of their responses. |
|
|
| ▲ | nharada 6 hours ago | parent | prev | next [-] |
| I'm stealing "Non-AI Summary" |
|
| ▲ | crishoj 8 hours ago | parent | prev | next [-] |
| Any idea what "output token efficiency" refers to? Gemini Flash is billed by number of input/output tokens, which I assume is fixed for the same output, so I'm struggling to understand how it could result in lower cost. Unless of course they have changed tokenization in the new version? |
| |
| ▲ | Romario77 6 hours ago | parent | next [-] | | They provide the answer in fewer words (while still conveying what needed to be said). That's a good thing in my book, as the models are now way too verbose (and I suspect one of the reasons is the billing by tokens). | |
| ▲ | minimaxir 8 hours ago | parent | prev | next [-] | | The post implies that the new models are better at thinking, therefore less time/cost is spent overall. The first chart implies the gains are minimal for nonthinking models. | |
| ▲ | kaspermarstal 6 hours ago | parent | prev [-] | | Models are less verbose, so they produce fewer output tokens, so answers cost less. |
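The replies above reduce to simple arithmetic: billing is per output token, so at a fixed per-token rate a less verbose answer is directly cheaper. A sketch, using a placeholder price (not a real Gemini rate) and the up-to-50% token reduction from the summary:

```python
# Illustrative cost math: fewer output tokens means a smaller bill,
# even with an unchanged per-token price. The rate below is made up.
PRICE_PER_M_OUTPUT_TOKENS = 2.50  # USD per 1M output tokens (hypothetical)

def output_cost(tokens: int, price_per_m: float = PRICE_PER_M_OUTPUT_TOKENS) -> float:
    """Cost of generating `tokens` output tokens at the given rate."""
    return tokens / 1_000_000 * price_per_m

old_cost = output_cost(10_000)             # verbose answer
new_cost = output_cost(int(10_000 * 0.5))  # 50% fewer tokens, per the summary
savings = 1 - new_cost / old_cost          # fraction of the bill saved
```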
|
|
| ▲ | Mistletoe 8 hours ago | parent | prev | next [-] |
| 2.5 Flash is the first time I've felt AI has become truly useful to me. I was the #1 AI hater but now find myself going to the Gemini app instead of Google search. It's just better in every way, and no ads. The info it provides is usually right, and it feels like I have the whole generalized and accurate knowledge of the internet at my fingertips in the app. It's more intimate, with fewer distractions. Just me and the Gemini app alone, talking about kale's ideal germination temperature, instead of a bunch of mommy bloggers, bots, and SEO spam. Now, how long Google can keep this going while cannibalizing how they make money is another question... |
| |
| ▲ | yesco 7 hours ago | parent | next [-] | | It's also excellent for subjective NLP-type analysis. For example, I use it for "scouting" chapters in my translation pipeline to compile coherent glossaries that I can feed into prompts for per-chapter translation. This involves having it identify all potential keywords and distinct entities, determine their approximate gender (important for languages with ambiguous gender pronouns), and then perform a line-by-line analysis of each chapter. For each line, it identifies the speaking entity, determines whose POV the line represents, and identifies the subject entity. While I didn't need or expect perfection, Gemini Flash 2.5 was the only model I tested that could not only follow all these instructions, but follow them well. The cheap price was a bonus. I was thoroughly impressed; it's now my go-to for any JSON-formatted analysis reports. | |
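The pipeline described above implies a structured per-line report the model must emit. A sketch of what such a JSON report and a consumer of it might look like; every field name here is an assumption for illustration, not the commenter's actual schema:

```python
import json

# Hypothetical shape of a glossary + line-by-line analysis report:
# a glossary of entities with approximate gender, and per-line
# speaker / POV / subject attributions.
report = json.loads("""
{
  "glossary": [
    {"term": "Akari", "kind": "character", "gender": "female"}
  ],
  "lines": [
    {"n": 1, "speaker": "Akari", "pov": "Akari", "subject": "Akari"}
  ]
}
""")

def speakers(rep: dict) -> set[str]:
    """Collect the distinct speaking entities identified per line."""
    return {line["speaker"] for line in rep["lines"]}
```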
| ▲ | indigodaddy 7 hours ago | parent | prev | next [-] | | Google AI Mode is excellent as well; I'd imagine that's just Gemini 2.5 Flash under the hood? | |
| ▲ | kridsdale1 6 hours ago | parent | prev [-] | | If you have access, try AI Mode on Google.com. It’s a different product from Gemini that tries to solve “search engine data presented in LLM format”. Disclaimer: I recently joined this team. But I like the product! |
|
|
| ▲ | jama211 7 hours ago | parent | prev [-] |
| Thank you for this, seems like an iterative improvement. |