yorwba | 4 hours ago
Your source cites https://www.washingtonpost.com/technology/2024/09/18/energy-... which in turn claims to be based on https://arxiv.org/abs/2304.03271, but it uses 0.14 kWh as the energy consumption for a 100-token request to GPT-4, an order of magnitude larger than any figure in that paper.

At a generation speed of 18 tokens/s (https://openrouter.ai/openai/gpt-4/performance), 100 tokens take about 5.6 s, so 0.14 kWh (504 kJ) implies a power draw of ≈91 kW, about two thirds of a 72-GPU rack (https://www.supermicro.com/datasheet/datasheet_SuperCluster_...).

I somewhat doubt that the model is large enough to require an entire rack's worth of GPU memory, but even if that were the case, a single request is going to get batched with hundreds or thousands of others at the same time, so the true energy consumption should be much smaller than that.
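For anyone who wants to check the arithmetic, here's the back-of-envelope calculation in Python (numbers are the 0.14 kWh claim and the ~18 tokens/s throughput from the links above; everything else is unit conversion):

```python
# Sanity check: what power draw does 0.14 kWh per 100-token
# GPT-4 request imply at ~18 tokens/s generation speed?

energy_kwh = 0.14        # claimed energy per 100-token request
tokens = 100
tokens_per_s = 18        # observed GPT-4 throughput (openrouter.ai)

energy_j = energy_kwh * 3.6e6        # 1 kWh = 3.6 MJ
duration_s = tokens / tokens_per_s   # ~5.6 s to generate 100 tokens
power_kw = energy_j / duration_s / 1e3

print(f"implied power draw: {power_kw:.0f} kW")  # ~91 kW
```

Note this already ignores batching: it assumes the request has the hardware to itself for the full 5.6 s, which is the most charitable reading of the 0.14 kWh figure.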