I can imagine a couple scenarios in which a high-quality, large model would be much preferred over lower latency models, primarily when you need the quality.