taupi 5 hours ago
There's a new "Effective Load" metric that we've looked at. It's derived from Power, which has the same problems we mentioned here: https://news.ycombinator.com/item?id=47925149 It's useful as a rough heuristic, but tends to overestimate utilization. We've also noticed that power-derived metrics lag behind true utilization, because the controller that regulates power has a delayed response. This becomes especially important for spiky workloads like real-time inference. Any tool (like nvtop) that only queries NVIDIA's NVML library does not have access to the detailed metrics that we draw upon, and therefore has to fall back on proxies to estimate efficiency.
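To make the lag point concrete, here's a toy sketch. It assumes the power-derived reading behaves like a first-order lag (exponential smoothing) over true utilization. That model, the `alpha` value, and the trace are all made up for illustration; this is not how NVIDIA's power controller is actually implemented.

```python
def lagged_proxy(true_util, alpha=0.2):
    """Apply a first-order lag (exponential smoothing) to a utilization trace.

    alpha is a hypothetical responsiveness factor: smaller means slower response.
    """
    proxy, state = [], 0.0
    for u in true_util:
        # Move the reading a fraction of the way toward the true value.
        state += alpha * (u - state)
        proxy.append(state)
    return proxy


# Hypothetical spiky trace: 5 busy samples, then 5 idle samples, repeated 3 times.
true_util = ([1.0] * 5 + [0.0] * 5) * 3
proxy = lagged_proxy(true_util)

for t, (u, p) in enumerate(zip(true_util, proxy)):
    print(f"t={t:2d} true={u:.1f} proxy={p:.2f}")
```

Under these assumptions the proxy reads low at the start of each burst and still reports load after the burst has ended, so for short bursts it never matches what the GPU is doing at that moment.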