I'm not sure I can think of anything that delivers more performance per watt for LLMs than Apple Silicon.
A datacenter GPU is going to deliver an order of magnitude more performance per watt.