saagarjha 6 days ago:
40% seems quite lightly utilized tbh
cpncrunch 6 days ago:
I tend to use 50% as a soft target, which seems like a good compromise. Sometimes it may go a little over that, but if it's only occasional it shouldn't be an issue. It's not good to go much over 50% on a server (assuming half the CPUs are just hyperthreads), because you're essentially relying on your load being able to share the physical CPU cores. At some point, when the load increases too much, there's no headroom left for that sharing, and you get to the point where adding a little more load at 80% utilization suddenly pushes you to 95%.
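A rough back-of-the-envelope version of that point (my numbers, not cpncrunch's: 2-way SMT and an assumed ~1.3x combined throughput when two siblings share a physical core):

    # Sketch: how reported (logical) CPU utilization maps to real capacity
    # used on a 2-way SMT machine. The 1.3x sibling-sharing figure is an
    # assumption for illustration, not a measured value.
    PHYSICAL_CORES = 32
    SMT_WIDTH = 2
    SMT_SPEEDUP = 1.3  # two busy siblings ~ 1.3x one busy core (assumed)

    def effective_capacity_used(logical_util):
        """Fraction of the machine's real throughput consumed at a given
        utilization reported across all logical CPUs."""
        busy_logical = logical_util * PHYSICAL_CORES * SMT_WIDTH
        if busy_logical <= PHYSICAL_CORES:
            # Each busy thread still gets a whole physical core to itself.
            return busy_logical / (PHYSICAL_CORES * SMT_SPEEDUP)
        # Beyond one thread per core, extra threads share cores and only
        # add the marginal SMT throughput.
        shared = busy_logical - PHYSICAL_CORES
        return (PHYSICAL_CORES + shared * (SMT_SPEEDUP - 1)) / (PHYSICAL_CORES * SMT_SPEEDUP)

    for util in (0.4, 0.5, 0.65, 0.8):
        print(f"reported {util:.0%} -> ~{effective_capacity_used(util):.0%} of real capacity")

Under those assumptions, 50% reported utilization already means every physical core has a busy thread and roughly three quarters of the machine's real throughput is spoken for, so the apparent headroom above 50% is much thinner than it looks.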
kqr 6 days ago:
It depends on how variable the load is, how fast the servers can scale up and down, and so on. My rule of thumb is to keep enough headroom to handle twice the load while staying within triple the response time. You can solve the equations for your specific case, but eyeballing graphs such as [1] I end up somewhere in the area of 40 %. The important part is of course to ask yourself "how much increased load may I need to handle, and how much can I degrade system performance in doing so?" You may work in an industry that only ever sees 10 % additional load at timescales where scaling is unfeasible, and then you can pick a significantly higher normal utilisation level. Or maybe you're in an industry where you cannot degrade performance by more than 10 % even if hit by five times the load – then you need a much, much more conservative target for utilisation.
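Solving that rule of thumb under one common model (my assumption: M/M/1, where mean response time scales as 1/(1 − ρ); the graphs in [1] may be based on something else) lands on exactly 40 %:

    # Sketch: kqr's "2x load within 3x response time" rule under an
    # assumed M/M/1 queueing model, where mean response time is
    # proportional to 1 / (1 - rho) for utilization rho.

    def response_time_factor(rho):
        """Mean response time relative to an idle server (M/M/1)."""
        return 1.0 / (1.0 - rho)

    def max_utilization(load_factor=2.0, allowed_slowdown=3.0):
        """Highest rho such that response time at load_factor * rho stays
        within allowed_slowdown times the response time at rho.
        Solves (1 - rho) / (1 - load_factor * rho) = allowed_slowdown."""
        return (allowed_slowdown - 1.0) / (allowed_slowdown * load_factor - 1.0)

    rho = max_utilization()  # 2x load, 3x response time
    print(f"target utilization: {rho:.0%}")  # -> 40%
    print(f"slowdown at 2x load: "
          f"{response_time_factor(2 * rho) / response_time_factor(rho):.1f}x")  # -> 3.0x

Working it out by hand: (1 − ρ)/(1 − 2ρ) = 3 gives ρ = 0.4, which matches the eyeballed figure.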
paravz 5 days ago:
CPU utilization percentages need to be contrasted with a "business" metric like latency or RPS. Depending on the environment and hardware, 40% can be too utilized or way underutilized.