
__jl__ 5 days ago

This is awesome. No preview release either, which is great for production.

They are pushing the prices higher with each release though: API pricing has gone up to $0.50/M for input and $3/M for output.

For comparison:

Gemini 3.0 Flash: $0.50/M for input and $3.00/M for output

Gemini 2.5 Flash: $0.30/M for input and $2.50/M for output

Gemini 2.0 Flash: $0.15/M for input and $0.60/M for output

Gemini 1.5 Flash: $0.075/M for input and $0.30/M for output (after price drop)

Gemini 3.0 Pro: $2.00/M for input and $12/M for output

Gemini 2.5 Pro: $1.25/M for input and $10/M for output

Gemini 1.5 Pro: $1.25/M for input and $5/M for output
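To make the trend in the list above easier to see, here is a minimal sketch that computes the generation-over-generation price multiples for the Flash line. The numbers are copied from the list; the dictionary and function names are just illustrative.

```python
# Prices in USD per million tokens, copied from the list above: (input, output).
FLASH_PRICES = {
    "Gemini 1.5 Flash": (0.075, 0.30),
    "Gemini 2.0 Flash": (0.15, 0.60),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Gemini 3.0 Flash": (0.50, 3.00),
}


def price_multiples(prices):
    """Return (model, input multiple, output multiple) versus the previous entry."""
    names = list(prices)  # dicts keep insertion order, oldest model first
    rows = []
    for prev, cur in zip(names, names[1:]):
        prev_in, prev_out = prices[prev]
        cur_in, cur_out = prices[cur]
        rows.append((cur, cur_in / prev_in, cur_out / prev_out))
    return rows


for model, in_mult, out_mult in price_multiples(FLASH_PRICES):
    print(f"{model}: input x{in_mult:.2f}, output x{out_mult:.2f}")
```

By these figures the biggest single jump was the output price going from 2.0 Flash to 2.5 Flash; the 3.0 Flash increase is smaller in relative terms.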

I think image input pricing went up even more.

Correction: It is a preview model...

▲

mips_avatar 5 days ago | parent | next [-]

I'm more curious how Gemini 3 Flash Lite performs and is priced when it comes out, because it may be that for most non-coding tasks the distinction isn't between Pro and Flash but between Flash and Flash Lite.

▲

KoolKat23 5 days ago | parent | prev | next [-]

Token usage also needs to be factored in, specifically when thinking is enabled: these newer models find difficult problems easier and use fewer tokens to solve them.

▲

srameshc 5 days ago | parent | prev | next [-]

Thanks, that was a great breakdown of the costs. I had just assumed the pricing was the same. The pricing probably comes from the confidence and the buzz around Gemini 3.0 as one of the best-performing models. But competition is hot in this area, and it won't be long before we get similarly performing models at a cheaper price.

▲

YetAnotherNick 5 days ago | parent | prev | next [-]

For comparison, GPT-5 mini is $0.25/M for input and $2.00/M for output, so Gemini 3 Flash is double the price for input and 50% higher for output.
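The ratios in the GPT-5 mini comparison can be checked directly from the prices quoted in this thread. The sketch below also computes a blended per-million price; the 80% input share is a hypothetical workload mix chosen for illustration, not a measured one, and the function names are made up.

```python
# USD per million tokens, as quoted in this thread: (input, output).
GEMINI_3_FLASH = (0.50, 3.00)
GPT_5_MINI = (0.25, 2.00)


def relative_increase(a, b):
    """Fraction by which price a exceeds price b (1.0 means double)."""
    return a / b - 1.0


def blended_price(prices, input_share):
    """Average USD per million tokens when input_share of all tokens are input."""
    price_in, price_out = prices
    return input_share * price_in + (1.0 - input_share) * price_out


print(f"input:  +{relative_increase(GEMINI_3_FLASH[0], GPT_5_MINI[0]):.0%}")
print(f"output: +{relative_increase(GEMINI_3_FLASH[1], GPT_5_MINI[1]):.0%}")

# input_share=0.8 is an assumption; real workloads vary a lot.
for name, prices in (("Gemini 3 Flash", GEMINI_3_FLASH), ("GPT-5 mini", GPT_5_MINI)):
    print(f"{name}: ${blended_price(prices, 0.8):.2f}/M blended")
```

The blended gap depends heavily on the assumed mix: input-heavy workloads sit closer to the 2x figure, output-heavy ones closer to 1.5x.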

	▲	AuthError 5 days ago | parent [-]
		flash is closer to sonnet than gpt minis though

▲

martythemaniak 5 days ago | parent | prev | next [-]

The price increase sucks, but you really do get a whole lot more. They also had the "Flash Lite" series: 2.5 Flash Lite is $0.10/M, so hopefully we see something like 3.0 Flash Lite for $0.20-$0.25.

▲

sunaookami 5 days ago | parent | prev | next [-]

This is a preview release.

	▲	reed1234 4 days ago | parent [-]
		https://openrouter.ai/google/gemini-3-flash-preview

▲

uluyol 5 days ago | parent | prev | next [-]

Are these the current prices or the prices at the time the models were released?

	▲	__jl__ 5 days ago | parent [-]
		Mostly at the time of release, except for 1.5 Flash, which got a price drop in Aug 2024. Google has been discontinuing older models after a transition period of several months, so I would expect the same for the 2.5 models. But that process only starts when the release versions of the 3 models are out (Pro and Flash are in preview right now).

▲

misiti3780 5 days ago | parent | prev [-]

Is there a website where I can compare OpenAI, Anthropic and Gemini models on cost per token?

▲

jsnell 5 days ago | parent | next [-]

There are plenty. But it's not the comparison you want to be making. There is too much variability in the number of tokens used for a single response, especially once reasoning models became a thing. And it gets even worse when you put the models into a variable-length output loop.

You really need to look at the cost per task. artificialanalysis.ai has a good composite score, measures the cost of running all the benchmarks, and has a 2D intelligence vs. cost graph.
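A minimal sketch of the cost-per-task idea, to show why per-token price alone can mislead. The two profiles and all token counts are hypothetical placeholders, not measurements of any real model; only the formula (tokens times price, summed over input and output) is the point.

```python
from dataclasses import dataclass


@dataclass
class ModelProfile:
    name: str
    input_price: float        # USD per million input tokens
    output_price: float       # USD per million output tokens
    avg_input_tokens: float   # tokens sent per task (hypothetical)
    avg_output_tokens: float  # tokens generated per task, reasoning included (hypothetical)


def cost_per_task(m: ModelProfile) -> float:
    """Average USD cost of one task: token counts times per-million prices."""
    total = m.avg_input_tokens * m.input_price + m.avg_output_tokens * m.output_price
    return total / 1_000_000


# Made-up profiles: model-a is pricier per token but terse,
# model-b is cheaper per token but spends 4x the output tokens.
model_a = ModelProfile("model-a", 0.50, 3.00, 2_000, 1_000)
model_b = ModelProfile("model-b", 0.25, 2.00, 2_000, 4_000)

for m in (model_a, model_b):
    print(f"{m.name}: ${cost_per_task(m):.4f} per task")
```

With these assumed numbers the model that is cheaper per token ends up roughly twice as expensive per task, which is the effect a per-token price table hides.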

▲

misiti3780 5 days ago | parent [-]

thanks

	▲	deaux 4 days ago | parent [-]
		For reference, the above depends entirely on what you're using them for. For many tasks, the number of tokens used is consistent within 10~20%.

▲

deaux 4 days ago | parent | prev | next [-]

https://www.helicone.ai/llm-cost

Tried a lot of them and settled on this one. They update instantly on model release, and having all models on one page is the best UX.

▲

rrhartjr 4 days ago | parent | prev | next [-]

https://www.llm-prices.com/

▲

int_19h 5 days ago | parent | prev [-]

https://openrouter.ai/models