Liwink 9 hours ago:
Gemini 2.5 Flash is an impressive model for its price. However, I don't understand why Gemini 2.0 Flash is still popular. From OpenRouter last week:

* xAI: Grok Code Fast 1: 1.15T
* Anthropic: Claude Sonnet 4: 586B
* Google: Gemini 2.5 Flash: 325B
* Sonoma Sky Alpha: 227B
* Google: Gemini 2.0 Flash: 187B
* DeepSeek: DeepSeek V3.1 (free): 180B
* xAI: Grok 4 Fast (free): 158B
* OpenAI: GPT-4.1 Mini: 157B
* DeepSeek: DeepSeek V3 0324: 142B
|
simonw 8 hours ago:
| My one big problem with OpenRouter is that, as far as I can tell, they don't provide any indication of how many companies are using each model. For all I know there are a couple of enormous whales on there who, should they decide to switch from one model to another, will instantly impact those overall ratings. I'd love to have a bit more transparency about volume so I can tell if that's what is happening or not. |
minimaxir 8 hours ago: Granted, given OpenRouter's 5.5% surcharge, any enormous whales have a strong financial incentive to use the provider's API directly. A "weekly active API keys" metric, faceted by model and app, would be a useful data point for measuring real-world popularity, though.
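A quick back-of-the-envelope sketch of that whale incentive. The 5.5% figure is from the comment above; the function name and the $1M spend are purely illustrative:

```python
# OpenRouter's surcharge, per the comment above.
SURCHARGE = 0.055

def openrouter_overhead(direct_cost_usd: float) -> float:
    """Extra spend vs. calling the provider's API directly, in USD."""
    return direct_cost_usd * SURCHARGE

# A hypothetical whale spending $1M/month on inference pays roughly $55k/month extra,
# which is plenty of motivation to wire up the provider's API directly.
print(openrouter_overhead(1_000_000))
```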
|
|
frde_me 8 hours ago:
I know we have a lot of workloads at my company running on older models that no one has bothered to upgrade yet.
|
mistic92 8 hours ago:
Price: 2.0 Flash is cheaper than 2.5 Flash, but it's still a very good model.
nextos 8 hours ago: API usage of 2.0 Flash is free, at least until you hit a very generous limit. It's not simply a trial period; you don't even need to register any payment details to get an API key. This might be a reason for its popularity. AFAIK only some Mistral offerings have a similar free tier.
FergusArgyll 7 hours ago: Yeah, that's my use case. When you want to test some program or script that uses an LLM in the middle, and you just want to make sure everything non-LLM-related is working, it's free! Just try again and again until it "compiles", then switch to 2.5.
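That workflow fits in a tiny switch: develop against the free-tier model, flip to 2.5 once the plumbing works. A minimal sketch, assuming the model names from this thread and a made-up `LLM_DEBUG` environment variable:

```python
import os

def pick_model(debug=None):
    """Free-tier model while wiring things up, paid model for real runs."""
    if debug is None:
        # Default to the free model unless LLM_DEBUG=0 is set explicitly.
        debug = os.getenv("LLM_DEBUG", "1") != "0"
    # 2.0 Flash has a free tier; 2.5 Flash is the paid target.
    return "gemini-2.0-flash" if debug else "gemini-2.5-flash"
```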
indigodaddy 7 hours ago: Wow, this would be great for a webapp/site that just needs a basic, performant LLM for some basic tasks.
nextos 6 hours ago: You might hit some throttling limits. During certain periods of the day, at least in my location, some requests are not served. It might not be OK for that kind of use case, or it might breach the ToS. But it's still great. Even my premium Perplexity account doesn't give me free API access.
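For that throttling, client-side retries with exponential backoff plus jitter are the usual workaround. A minimal sketch; `call` stands in for any request function that raises when throttled, and nothing here is a real SDK's API:

```python
import random
import time

def with_backoff(call, retries=4, base=1.0):
    """Retry `call` with exponential backoff and jitter; re-raise if it never succeeds."""
    for attempt in range(retries):
        try:
            return call()
        except Exception:
            if attempt == retries - 1:
                raise
            # Wait roughly base, 2*base, 4*base, ... seconds, plus jitter
            # so many clients don't all retry at the same instant.
            time.sleep(base * 2 ** attempt + random.uniform(0, base))
```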
|
|
|
|
|
crazysim 9 hours ago:
Maybe the same reason they kept the name for the 2.5 Flash update: people are lazy about pointing to the latest name.
|
koakuma-chan 9 hours ago:
Why is Grok so popular?
minimaxir 8 hours ago: Grok Code Fast 1 usage is driven almost entirely by Kilo Code and Cline: https://openrouter.ai/x-ai/grok-code-fast-1/apps

Both apps have offered usage for free for a limited time: https://blog.kilocode.ai/p/grok-code-fast-get-this-frontier-... https://cline.bot/blog/grok-code-fast
ewoodrich 7 hours ago: Yep, Kilo (and Cline/Roo more recently) push these free-trial-of-the-week models really hard, partially as an incentive to register an account with their cloud offering.

I began using Cline and Roo before "cloud" features were even a thing and still haven't bothered to register, but I do play with the free Kilo models when I see them, since I'm already signed in (they got me with some kind of register-and-spend-$5-to-get-$X-in-model-credits deal), and hey, it's free (I really don't care about my random personal projects being used for training).

If xAI in particular is in the mood to light cash on fire promoting their new model, you'll see it everywhere during the promo period, so I'm not surprised that heavily boosts xAI's stats. The mystery-codename models of the week are a bit easier to miss.
NitpickLawyer 8 hours ago: It's pretty good and fast af. At backend stuff it's roughly GPT-5 Mini in capabilities, writes OK code, and works well with agentic extensions like Roo/Kilo. My colleagues said it handles frontend creation so-so, but it's so fast that you can "roll" a couple of tries and choose the one you want. Also cheap enough to not really matter.
SR2Z 8 hours ago: Yeah, the speed and price are why I use it. I find that any LLM is garbage at writing code unless it gets constant high-entropy feedback (e.g. an MCP tool reporting lint errors, a test, etc.), and the quality of the final code depends a lot more on how well the LLM was guided than on the quality of the model. A bad model with good automated tooling and prompts will beat a good model without them, and if your goal is to build good tooling and prompts, you need a tighter iteration loop.
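That feedback loop can be sketched in a few lines: generate, check, feed the error back, repeat. Here `generate` is a stand-in for any LLM call, and Python's built-in `compile()` stands in for the linter or test suite; both are illustrative, not any specific tool's API:

```python
def refine(generate, prompt, max_rounds=3):
    """Iterate until the candidate code passes the checker, feeding errors back."""
    feedback = ""
    for _ in range(max_rounds):
        code = generate(prompt + feedback)
        try:
            # Stand-in for the "high-entropy feedback" source: lint, tests, etc.
            compile(code, "<candidate>", "exec")
            return code
        except SyntaxError as err:
            feedback = f"\nFix this error and try again: {err}"
    return None  # give up after max_rounds
```

The point the comment makes is that the checker in the middle matters more than which model fills the `generate` slot.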
nwienert 8 hours ago: This is so far off my experience. Grok 4 Fast is straight trash; it didn't even come close to decent code for what I tried. Meanwhile Sonnet is miles better. But even then, while Opus is, I guess, technically only slightly better, in practice it's so much better that I find it hard to use Sonnet at all.
SR2Z 7 hours ago: Not Grok 4, the code variant of Grok. I think it's different; I agree with you that Grok 4 kind of sucks.
nwienert 6 hours ago: I meant to say the code variant, actually, my bad. I found it significantly worse.
|
|
|
coder543 8 hours ago: I think it has been free in some editor plugins, which is probably a significant factor. I would rather use a model that is good than a model that is free, but different people have different priorities.
Imustaskforhelp 8 hours ago: I mean, I can kinda roll through a lot of iterations with this model without worrying about any AI limits. Y'know, with all these latest models the lines are kinda blurry; the definition of "good" is foggy. So it might as well be free, since the definition of money is crystal clear. I also used it for some time to test something really, really niche, like building a Telegram bot on Cloudflare Workers, and grok-4-fast was kinda decent at that for the most part. So that's nice.
YetAnotherNick 8 hours ago: The non-free version has double the usage of the free one. The free one uses your data for training.
davey48016 8 hours ago: I think it's very cheap right now.
BoredPositron 8 hours ago: They had a lot of free promos with coding apps. It's okay and cheap, so I bet some stuck with it.
keeeba 8 hours ago: It came from nowhere to 1T tokens per week, which seems… suspect.
riku_iki 8 hours ago: I think it is included for free in some coding product.
|
|
YetAnotherNick 8 hours ago:
Gemini 2.0 Flash is the best fast non-reasoning model by quite a margin. Lots of things don't require any reasoning.
|
PetrBrzyBrzek 8 hours ago:
It’s cheaper and faster. What’s not to understand?