FinnDituri 4 hours ago

While I appreciate the irony in the trend of using AI to discredit people making positive claims about AI, it's a pet peeve of mine when it's used as a lazy way to avoid citing the original claim made against AI. It's reminiscent of the 'no you' culture of early-2000s forums. There's some meta-irony here too, in that it often has to be debunked by humans; maybe that's the point. It doesn't diminish my opinion of LLMs, it just makes me think the Luddites may have had a point.

For instance, in the Gemini screenshot, the claim of 100-500x more resource usage for AI queries comes from water usage; however, it's not clear to me why data center water usage for an AI query would be 100-500x higher than for a Google search when power usage for an AI query is supposedly only 10-30x higher. Are water usage and CO2 footprint not derived from power consumption? Did the LLM have to drink as much water while thinking as I did while researching the original claim?
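For what it's worth, here's the relationship I'd expect, sketched with made-up numbers: data center water use roughly tracks energy use via a Water Usage Effectiveness factor (litres per kWh), so a 10-30x energy ratio should imply a water ratio in the same ballpark, not 100-500x. The WUE and per-query energies below are assumptions on my part, not figures from the screenshot or the cited sources.

    # Assumed relationship: water ~ energy * WUE. All numbers are placeholders.
    WUE_L_PER_KWH = 1.1                  # assumed fleet-wide water usage effectiveness

    def water_ml_per_query(energy_wh):
        # convert Wh -> kWh, apply WUE, convert litres -> millilitres
        return (energy_wh / 1000) * WUE_L_PER_KWH * 1000

    print(water_ml_per_query(0.3))       # Google search at 0.3 Wh -> ~0.33 mL
    print(water_ml_per_query(0.3 * 20))  # AI query at 20x the energy -> ~6.6 mL, i.e. the same 20x ratio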

The 10-30x more power consumption claim seems to come from this scientific paper [0] from late 2023, which cites a news article quoting Alphabet's chairman as saying 'a large language model likely cost 10 times more than a standard keyword search, [though fine-tuning will help reduce the expense quickly]'. Editorialising the quote is not a good look for a scientific paper. The paper also cites a newsletter from an analyst firm [1] that performs a back-of-the-envelope calculation to estimate OpenAI's costs, looks at Google's revenue per search, and estimates how much it would cost Google to add an AI query to every Google search. Treating it like a Fermi problem is reasonable, I guess; you can get within an order of magnitude if your guesstimates are reasonable. The same analyst firm did a similar calculation [2] and came to the conclusion that training a dense 1T model costs $300m. It should be noted that GPT-4 cost 'more than $100m' and has been leaked to be a 1.8T MoE, Llama 3.1 405B was around 30M GPU hours (likely $30-60m), and DeepSeek, a 671B MoE, was trained for around $5m. However, while this type of analysis is fine for a newsletter, citing it to work out how many additional servers Google would need to add an AI query to every search, taking the estimated power consumption of those servers, and deriving a 6.9–8.9 Wh per-request figure from the number of search queries Google receives is simply beyond my comprehension. I gave up trying to make sense of what this paper is doing, and this summary may be a tad unfair as a result. You can run the paper through Gemini if you would prefer an unbiased summary :-).
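As best I can tell, the shape of that derivation is something like the sketch below. Every number here is a placeholder I made up purely to show how a per-request Wh figure falls out of server-count guesstimates; these are not the newsletter's or the paper's actual inputs.

    # Placeholder Fermi-style estimate: servers -> power -> Wh per query.
    extra_servers = 500_000           # guess: servers needed to serve an AI query per search
    watts_per_server = 1_500          # guess: average draw per server, accelerators included
    searches_per_day = 8.5e9          # commonly quoted ballpark for daily Google searches

    wh_per_day = extra_servers * watts_per_server * 24
    wh_per_query = wh_per_day / searches_per_day
    print(round(wh_per_query, 1))     # ~2.1 Wh with these placeholders; the paper lands on 6.9-8.9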

The paper also cites another research paper [3] from late 2022 which estimates that a dense 176B-parameter model (comparable to GPT-3) uses 3.96 Wh per request. They derive this figure by running the model in the cloud. What a novel concept. Given the date of the paper, I wouldn't be surprised if they ran the model with the original BF16 weights, although I didn't check. I could see this coming down to 1 Wh per request when quantised to INT4 or similar, and with better caching/batched requests/utilisation/modern GPUs/etc. I could see this getting pretty close to the often quoted [4, from 2009 mind] 0.3 Wh per Google search.
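The ratios that fall out of those figures, for what they're worth (the 1 Wh 'optimised' number is my own guess, not something the paper measured):

    bloom_wh = 3.96         # 176B dense model, per request, from [3]
    search_wh = 0.3         # per Google search, from [4] (2009)
    optimised_wh = 1.0      # guess: INT4 quantisation, batching, modern GPUs

    print(bloom_wh / search_wh)       # ~13x
    print(optimised_wh / search_wh)   # ~3.3x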

Google themselves [5] state the median Gemini text prompt uses 0.24 Wh.

I simply don't see where 100x is coming from. 10x is something I could believe if we're factoring in training resource consumption: some extremely dodgy napkin maths (spelled out below) leads me to believe a moderately successful ~1T model amortises to around 3 Wh per prompt, which subjectively is pretty close to the 3x claim I've ended up defending. If we're going this route we'd have to include the total consumption for search too, as I have no doubt Google simply took the running consumption divided by the number of searches. Add in failed models, determine how often either a Google search or an AI query is successful, factor in how much utility the model providing the information delivers (as it's clearly no longer just about power efficiency), etc. There's a lot to criticise about GenAI, but I really don't think Google searches being marginally more power efficient is one of them.
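To spell out that napkin maths: every input below is an assumption on my part (the GPU-hours figure is the Llama 3.1 405B one from above, the lifetime prompt count is a pure guess), and the point is only that amortised training plus inference lands in the single-digit-Wh range, i.e. roughly 10x a Google search rather than 100x.

    gpu_hours = 30e6              # assumed training run, Llama 3.1 405B scale
    watts_per_gpu = 700           # assumed average draw per GPU incl. overhead
    training_wh = gpu_hours * watts_per_gpu         # ~2.1e10 Wh, i.e. ~21 GWh

    lifetime_prompts = 1e10       # pure guess at prompts served over the model's life
    amortised_wh = training_wh / lifetime_prompts   # ~2.1 Wh per prompt

    inference_wh = 0.24           # Google's median Gemini text prompt [5]
    search_wh = 0.3               # per Google search [4]

    print(amortised_wh + inference_wh)                 # ~2.3 Wh per prompt
    print((amortised_wh + inference_wh) / search_wh)   # ~8x a Google search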

[0] https://www.sciencedirect.com/science/article/pii/S254243512...

[1] https://newsletter.semianalysis.com/p/the-inference-cost-of-...

[2] https://newsletter.semianalysis.com/p/the-ai-brick-wall-a-pr...

[3] https://arxiv.org/abs/2211.02001

[4] https://googleblog.blogspot.com/2009/01/powering-google-sear...

[5] https://cloud.google.com/blog/products/infrastructure/measur...