AkelaA 10 hours ago

I think it's funny that at no point in the article do they mention the idea of simply making LLMs more efficient. I guess that's not important when all you care about is winning the AI "race" rather than selling a long-term sustainable product.

redox99 9 hours ago | parent | next [-]

If you make it more efficient, then you train it for longer or make it larger. You're not going to just idle your GPUs.

And yes, of course it's a race; all else being equal, nobody's going to use your model if someone else has a better one.

cl0ckt0wer 10 hours ago | parent | prev | next [-]

They are already power-constrained. Any efficiency improvements would immediately be allocated to more AI.

inkysigma 9 hours ago | parent | prev [-]

What makes you think the entire process isn't being made more efficient? There are entire papers dedicated to squeezing more FLOPs out of GPUs so that less energy is wasted simply moving data around. On the inference side there are optimizations like speculative decoding and mixture-of-experts (MoE) models, though some of those trade cheaper inference for a more expensive training process.
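For anyone unfamiliar with speculative decoding, the idea is: a small draft model cheaply proposes a run of tokens, and the large target model verifies the whole run at once, keeping the longest prefix it agrees with. A minimal Python sketch of the greedy variant, where both "models" are deterministic toy functions standing in for real LLMs and the chunk size k is arbitrary:

    def target_model(ctx):
        # Stand-in for the large, expensive model: next token as a
        # deterministic function of the context.
        return hash(tuple(ctx)) % 100

    def draft_model(ctx):
        # Stand-in for the small, cheap model: by construction it agrees
        # with the target about half the time.
        return hash(tuple(ctx)) % 50

    def speculative_step(ctx, k=4):
        # 1. Draft k tokens autoregressively with the cheap model.
        draft_ctx = list(ctx)
        proposals = []
        for _ in range(k):
            tok = draft_model(draft_ctx)
            proposals.append(tok)
            draft_ctx.append(tok)
        # 2. Verify with the target model. In a real system this is one
        #    batched forward pass over all k positions, not k passes.
        out = list(ctx)
        for tok in proposals:
            want = target_model(out)
            if tok == want:
                out.append(tok)    # accepted: a token the big model never
                                   # had to generate one-by-one
            else:
                out.append(want)   # first mismatch: keep the target's own
                break              # token, discard the rest of the draft
        return out

    print(speculative_step([1, 2, 3], k=4))

The output is identical to what the target model would have produced on its own; the win is that the expensive model runs once per accepted chunk instead of once per token, at the cost of running the draft model and occasionally throwing speculative work away.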

The other big problem is that you can always scale up to soak up whatever efficiency you gain. I do wonder whether that levels off eventually, though: if performance plateaus, then presumably efficiency gains will catch up with demand. That said, it doesn't look like that's happening in the near future.