giovannibonetti 3 days ago
Even better, based on queue latency instead of length.
jcrites 3 days ago
The single best metric I've found for scaling things like this is the percent of concurrent capacity that's in use. I wrote about this in a previous HN comment: https://news.ycombinator.com/item?id=41277046

Scaling on things like the length of the queue doesn't work very well in practice. A queue length of 100 might be horribly long in some workloads and insignificant in others, so scaling on queue length requires a lot of tuning that must be adjusted over time as the workload changes.

Scaling based on percent of concurrent capacity can work for most workloads, and tends to remain stable over time even as workloads change.
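
To make the comparison concrete, here is a minimal Python sketch of scaling on percent of concurrent capacity in use. It is an illustration only: the function names, the 70% target, the fixed number of slots per worker, and the min/max bounds are all assumptions, not details taken from this comment or the linked one.

```python
def utilization_percent(busy_slots: int, workers: int, slots_per_worker: int) -> float:
    """Percent of total concurrent capacity currently in use."""
    if workers <= 0 or slots_per_worker <= 0:
        raise ValueError("workers and slots_per_worker must be positive")
    return 100.0 * busy_slots / (workers * slots_per_worker)


def desired_workers(
    busy_slots: int,
    slots_per_worker: int,
    target_percent: int = 70,   # assumed target, tune for your own headroom
    min_workers: int = 1,       # assumed lower bound
    max_workers: int = 100,     # assumed upper bound
) -> int:
    """Worker count that would put utilization at or just below the target."""
    if busy_slots < 0 or slots_per_worker <= 0:
        raise ValueError("busy_slots must be >= 0 and slots_per_worker > 0")
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in (0, 100]")
    # Integer ceiling division keeps the result exact (no float rounding).
    numerator = busy_slots * 100
    denominator = slots_per_worker * target_percent
    wanted = -(-numerator // denominator)
    # Clamp to the configured bounds.
    return max(min_workers, min(max_workers, wanted))
```

With 10 slots per worker and 90 requests in flight, 10 workers are at 90% utilization and the sketch asks for 13. Queue length never enters the calculation, so the same 70% target carries over between workloads where a backlog of 100 means very different things.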