The initial cost of serving is very high, and while they are super performant, they are not great for scaling up.
In practice, they are also not very flexible compared to GPUs.