mgraczyk 3 days ago |
As others have pointed out, this is false. Google has made their models and hardware more efficient; you can read the linked report. Most of the efficiency comes from quantization, MoE, new attention techniques, and distillation (making smaller models usable in place of bigger ones).
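A toy sketch of the first of those techniques, symmetric int8 weight quantization: store weights as 8-bit integers plus one scale factor instead of float32. Purely illustrative, not anything from the report:

    import numpy as np

    # Toy symmetric int8 weight quantization: keep int8 values plus one
    # float scale instead of float32 weights, roughly 4x less memory/bandwidth.
    w = np.random.randn(4, 4).astype(np.float32)

    scale = np.abs(w).max() / 127.0                              # per-tensor scale
    w_int8 = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    w_dequant = w_int8.astype(np.float32) * scale

    print("max abs error:", np.abs(w - w_dequant).max())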
jjani 3 days ago | parent | next
- The report doesn't name any Gemini models at all, only competitors'. Wonder why that is? If the models got so much more efficient, they'd be eager to show this.

- The report doesn't give any averages (means), only medians. Why oh why would they do that, when marketing pieces otherwise always use the average, since outside of HN 99% of Joes on the street have no idea what a median is or how it differs from the mean? The average is much more relevant here when "measuring the environmental impact of AI inference" (see the sketch below).

- The report doesn't define what any of the terms "Gemini Apps", "the Gemini AI assistant", or "Gemini Apps text prompt" concretely mean.
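On the median-vs-mean point: total energy scales with the mean times the prompt count, not the median, and per-prompt cost is almost certainly heavy-tailed (a few long-context or agentic prompts dominate). A minimal sketch with made-up numbers, purely to show the gap, not figures from any report:

    import statistics

    # Hypothetical per-prompt energy (Wh); the long tail is invented for illustration.
    energy_wh = [0.2, 0.2, 0.3, 0.3, 0.4, 0.5, 0.6, 5.0, 12.0, 40.0]

    print("median:", statistics.median(energy_wh))  # 0.45 Wh -- the kind of figure a median headline gives
    print("mean:  ", statistics.mean(energy_wh))    # 5.95 Wh -- what total energy actually tracks
    print("total: ", sum(energy_wh))                # 59.5 Wh = mean * count, not median * count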
oulipo2 2 days ago | parent | prev
Sure, but the issue is: if you make the model 30x more efficient but use it 300x more often (mostly for stuff nobody wants), it's still a net loss.
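Back-of-the-envelope with the hypothetical factors in that comment:

    # Hypothetical factors from the comment above, not measured numbers.
    efficiency_gain = 30     # per-prompt energy drops 30x
    usage_growth = 300       # prompt volume grows 300x

    relative_total_energy = usage_growth / efficiency_gain
    print(relative_total_energy)  # 10.0 -> total energy is still ~10x what it was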