True, I think local inference is still far more expensive than API access for my use case: providers amortize their hardware across batched requests from many users, while my usage is sporadic, roughly hourly, so a local GPU would sit idle most of the time. That said, I also didn't expect hardware prices (RTX 5090, RAM) to rise this quickly.
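A rough back-of-envelope sketch of that comparison; every number here (GPU price, power draw, throughput, API rate, usage hours) is an illustrative assumption, not a measured figure:

```python
# Back-of-envelope: local GPU vs. hosted API cost per million tokens.
# All constants below are hypothetical assumptions for illustration only.

GPU_PRICE_USD = 2500.0          # assumed RTX 5090 street price
GPU_LIFETIME_YEARS = 3.0        # assumed useful life before replacement
POWER_DRAW_KW = 0.5             # assumed draw under load
ELECTRICITY_USD_PER_KWH = 0.30  # assumed residential rate

TOKENS_PER_SECOND = 60          # assumed single-user decode throughput
ACTIVE_HOURS_PER_DAY = 1.0      # sporadic, roughly hourly usage

API_USD_PER_MTOK = 1.0          # assumed blended API price per Mtok

daily_tokens = TOKENS_PER_SECOND * 3600 * ACTIVE_HOURS_PER_DAY
lifetime_tokens = daily_tokens * 365 * GPU_LIFETIME_YEARS

energy_cost = (POWER_DRAW_KW * ACTIVE_HOURS_PER_DAY * 365
               * GPU_LIFETIME_YEARS * ELECTRICITY_USD_PER_KWH)
local_usd_per_mtok = (GPU_PRICE_USD + energy_cost) / lifetime_tokens * 1e6

print(f"local:  ${local_usd_per_mtok:.2f} per Mtok")  # ~$11 under these inputs
print(f"hosted: ${API_USD_PER_MTOK:.2f} per Mtok")
```

Under these assumed inputs the local card comes out roughly an order of magnitude more expensive per token, and the gap is mostly utilization: the provider keeps its GPUs saturated with batched traffic around the clock, while a personal card used an hour a day sits idle the other 23.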