jstummbillig 12 hours ago

Nope, you can't, and it takes a simple Gemini query to find out the actual multiplier x, if you are interested in it. (closer to 3, last time I checked, which rounds to 0, especially considering the clicks you save when using the LLM)

oblio 12 hours ago | parent

> jstummbillig:

> Nope, you can't, and it takes a simple Gemini query to find out the actual multiplier x, if you are interested in it. (closer to 3, last time I checked, which rounds to 0, especially considering the clicks you save when using the LLM)

Why would you lie? https://imgur.com/a/1AEIQzI

For those who don't want to open the Gemini answer screenshot: best case scenario 10x, worst case scenario 100x, definitely not "3x, which rounds to 0x". Or, to put it in Gemini's words:

> Summary

> Right now, asking Gemini a question is roughly the environmental equivalent of running a standard 60-watt lightbulb for a few minutes, whereas a Google Search is like a momentary flicker. The industry is racing to make AI as efficient as Search, but for now, it remains a luxury resource.

jstummbillig 12 hours ago | parent | next

Are you okay? You ventured 100x, and that's wrong. What would you know about when the last time I checked was, and in what context exactly? Anyway, good job on doing what I suggested you do, I guess.

The reason why it all rounds to 0 is that the Google search will not give you an answer. It gives you a list of web pages that you then need to visit (oftentimes more than one of them), generating more requests, and, more importantly, it asks for more of your time, the human, whose cumulative energy expenditure just to be able to ask in the first place is quite significant – time that you then cannot spend on the things an LLM is not able to do for you.

lokar 11 hours ago | parent | next

Serving a request for (often mostly static) content like that uses a tiny tiny amount of energy.

oblio 12 hours ago | parent | prev

You condescendingly said – sorry, "ventured" – that the difference is basically 0x, claiming: "use Gemini to check yourself that the difference is basically 0". Well, I did take you up on that, and even Gemini doesn't agree with you.

Yes, Google Search gives you raw info. Yes, Google Search quality is currently degrading.

But Gemini can also hallucinate. And its answers can just be flat out wrong, because they come from the same raw data (yes, it has cross-checks and it "thinks", but it's far from infallible).

Also, the comparison of human energy usage with GenAI energy usage is super ridiculous :-)))

Animal intelligence (including human intelligence) is one of the most energy-efficient things on this planet, honed by billions of years of cut-throat (literally!) evolution. You can argue about time "wasted" analysing search results (which, BTW, generally makes us smarter and better informed...), but energy-wise the average human brain runs on roughly 20 watts (less than an incandescent light bulb) to provide general intelligence, and it does 100 other things at the same time.

jstummbillig 12 hours ago | parent

Ah, we are in "making up quotes" territory: putting quotation marks around things someone else never actually said. Classy.

Talking about "condescending":

> super ridiculous :-)))

It's not the energy-efficient animal intelligence that got us here, but a lot of completely inefficient human years to begin with: first to keep us alive, then to give us primary and advanced education and our first experiences, so that we become somewhat productive human beings. This is the capex of making a human, and it's significant – especially since we will soon die.

This capex exists in LLMs too, but it rounds to zero, because one model will be used for quadrillions of tokens. In you or me, however, it does not round to zero, because the number of tokens we produce rounds to zero. To compete on productivity, the tokens we produce therefore need to be vastly better. If you think you are doing the smart thing by using them to compile Google searches, you are simply bad at math.
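
To make the "rounds to zero" arithmetic concrete, a minimal sketch in Python (the cost and token counts are round hypothetical numbers, not sourced figures):

    # Hypothetical illustration: training capex amortised per generated token.
    training_cost_usd = 100e6  # assume a ~$100m training run
    lifetime_tokens = 1e15     # assume ~1 quadrillion tokens served over the model's life

    capex_per_token = training_cost_usd / lifetime_tokens
    print(f"${capex_per_token:.10f} per token")  # $0.0000001000 -- rounds to zero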

lokar 11 hours ago | parent | prev | next

Google web search is incredibly efficient

oblio 7 hours ago | parent

So are most procedural (i.e., non-GenAI) services out there. Otherwise we couldn't have built them on infrastructure with 10,000x less computing power than the GenAI infrastructure they're building now.

FinnDituri 6 hours ago | parent | prev

While I appreciate the irony in the trend of using AI to discredit people making positive claims about AI, it's a pet peeve of mine when it's used as a lazy way to avoid citing the original claim made against AI. It's reminiscent of the 'no you' culture from early-2000s forums. There's some meta-irony here too, in that it often has to be debunked by humans (maybe that's the point), but it doesn't diminish my opinion of LLMs; it just makes me think the Luddites may have had a point.

For instance, in the Gemini screenshot, the claim of 100-500x more resource usage for AI queries comes from water usage. However, it's not clear to me why data-center water usage for AI queries would be 100-500x that of a Google search when power usage for an AI query is supposedly only 10-30x more. Are water usage and CO2 footprint not derived from power consumption? Did the LLM have to drink as much water while thinking as I did while researching the original claim?

The 10-30x power consumption claim seems to come from this scientific paper [0] from late 2023, which cites a news article quoting Alphabet's chairman as saying 'a large language model likely cost 10 times more than a standard keyword search, [though fine-tuning will help reduce the expense quickly]'. Editorialising the quote is not a good look for a scientific paper.

The paper also cites a newsletter from an analyst firm [1] that performs a back-of-the-envelope calculation to estimate OpenAI's costs, looks at Google's revenue per search, and estimates how much it would cost Google to add an AI query to every Google search. Treating it like a Fermi problem is reasonable, I guess; you can get within an order of magnitude if your guesstimates are reasonable. The same analyst firm did a similar calculation [2] and came to the conclusion that training a dense 1T model costs $300m. It should be noted that GPT-4 cost 'more than $100m' and has been leaked to be a 1.8T MoE, Llama 3.1 405B took around 30M GPU hours (likely $30-60m), and DeepSeek, a 671B MoE, was trained for around $5m.

However, while this type of analysis is fine for a newsletter, citing it to estimate how many additional servers Google would need to add an AI query to every search, taking the estimated power consumption of those servers, and dividing by the number of search queries Google receives to derive a 6.9-8.9 Wh figure per request is simply beyond my comprehension. I gave up trying to make sense of what this paper is doing, so this summary may be a tad unfair as a result. You can run the paper through Gemini yourself if you'd prefer an unbiased summary :-).
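
To show how fragile that style of estimate is, here is the shape of the calculation in Python (every input below is a placeholder I made up, not a figure from the paper or the newsletter):

    # Placeholder Fermi estimate: extra servers needed to bolt an AI query
    # onto every search, amortised per query. All three inputs are guesses.
    extra_servers = 500_000     # guessed count of additional AI servers
    watts_per_server = 2_000    # guessed average draw per server, in watts
    searches_per_year = 3.2e12  # guessed annual Google search volume

    hours_per_year = 24 * 365
    wh_per_query = extra_servers * watts_per_server * hours_per_year / searches_per_year
    print(f"{wh_per_query:.1f} Wh per query")  # ~2.7 Wh with these guesses

Halve or double any one input and the result moves with it, which is why a headline per-request figure built this way deserves very wide error bars.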

The paper also cites another research paper [3] from late 2022, which estimates that a dense 176B-parameter model (comparable to GPT-3) uses 3.96 Wh per request. They derive this figure by actually running the model in the cloud. What a novel concept. Given the date of the paper, I wouldn't be surprised if they ran the model with the original BF16 weights, although I didn't check. I could see this coming down to 1 Wh per request when quantised to INT4 or similar, and with better caching/batched requests/utilisation/modern GPUs/etc. I could see it getting pretty close to the often-quoted [4, from 2009, mind] 0.3 Wh per Google search.
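
The ratios are easy to check (the 3.96 Wh and 0.3 Wh figures are from the sources above; the 1 Wh INT4 number is my own guess):

    # Per-request energy ratios. The first and last figures are sourced;
    # the INT4 estimate is a guess at quantisation/batching savings.
    bloom_wh = 3.96  # BF16 176B model, measured in the cloud [3]
    int4_wh = 1.0    # guessed after INT4 quantisation and better batching
    search_wh = 0.3  # per Google search, per the 2009 blog post [4]

    print(f"BF16 vs search: {bloom_wh / search_wh:.0f}x")  # ~13x
    print(f"INT4 vs search: {int4_wh / search_wh:.1f}x")   # ~3.3x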

Google themselves [5] state the median Gemini text prompt uses 0.24 Wh.

I simply don't see where 100x is coming from. 10x is something I could believe if we're factoring in training resource consumption, as some extremely dodgy napkin maths leads me to believe a moderately successful ~1T model gets amortised to about 3 Wh per prompt, which, subjectively, is pretty close to the 3x claim I've ended up defending. If we're going this route, we'd have to include the total consumption for search too, as I have no doubt Google simply took the running consumption divided by the number of searches. Add in failed models, determine how often either a Google search or an AI query is successful, factor in how much utility the model providing the information adds (since it's clearly no longer just about power efficiency), etc. There's a lot to criticise about GenAI, but I really don't think Google searches being marginally more power efficient is one of them.
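
For what that napkin maths is worth, here it is spelled out (the GPU-hour count is the reported Llama 3.1 405B figure and the 0.24 Wh is Google's stated median [5]; the per-GPU draw and lifetime prompt count are my assumptions):

    # Napkin math: training energy amortised over a model's lifetime prompts.
    gpu_hours = 30e6         # Llama 3.1 405B reportedly took ~30M GPU hours
    watts_per_gpu = 700      # assumed average accelerator draw, in watts
    lifetime_prompts = 1e10  # assumed 10 billion prompts over the model's life
    inference_wh = 0.24      # Google's stated median per Gemini text prompt [5]

    training_wh = gpu_hours * watts_per_gpu  # ~21 GWh
    amortised_wh = training_wh / lifetime_prompts + inference_wh
    print(f"{amortised_wh:.2f} Wh per prompt")  # ~2.34 Wh with these assumptions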

[0] https://www.sciencedirect.com/science/article/pii/S254243512...

[1] https://newsletter.semianalysis.com/p/the-inference-cost-of-...

[2] https://newsletter.semianalysis.com/p/the-ai-brick-wall-a-pr...

[3] https://arxiv.org/abs/2211.02001

[4] https://googleblog.blogspot.com/2009/01/powering-google-sear...

[5] https://cloud.google.com/blog/products/infrastructure/measur...