| ▲ | matanyall 6 hours ago | |
It's funny: I've never done a cost-benefit analysis of having "good monitoring" and still not being able to figure out what broke, then needing to pull in someone who doesn't need the monitoring at all because they built the thing. | ||
| ▲ | pixl97 3 hours ago | |
It's probably something along the lines of "Monitoring solves the problems you expect to have". For example, you don't even question it when you see latency going up on some service: you can see DB load going up, and you start another instance, either manually or with a script. Monitoring all this stuff also lets you call the DBA/app team/etc. 20 minutes sooner when you see some component act up and you have no idea why. Hopefully that person on the app team then adds a new way of showing what the problem was in case it ever happens again, and it turns into the first type of problem, the kind you never think about again (or hope was actually fixed in the application). | ||
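A rough sketch of the two kinds of problem described above, in Python. Everything here is hypothetical: the `Snapshot` fields, the threshold values, and the action names are made up for illustration and don't come from any particular monitoring tool. The expected pattern (latency up together with DB load) maps to a scripted scale-out; anything else that looks wrong just gets a human paged sooner.

```python
from dataclasses import dataclass

# Hypothetical thresholds; real values depend entirely on the service.
LATENCY_MS_LIMIT = 500.0
DB_LOAD_LIMIT = 0.80
ERROR_RATE_LIMIT = 0.05


@dataclass
class Snapshot:
    """One reading of the (assumed) metrics for a service."""
    latency_ms: float
    db_load: float     # fraction of capacity, 0..1
    error_rate: float  # fraction of failed requests, 0..1


def triage(s: Snapshot) -> str:
    """Return 'ok', 'scale_out', or 'page_owner' for a metrics snapshot."""
    latency_high = s.latency_ms > LATENCY_MS_LIMIT
    db_busy = s.db_load > DB_LOAD_LIMIT
    errors_high = s.error_rate > ERROR_RATE_LIMIT

    # The problem you expect to have: latency and DB load rise together,
    # so start another instance without thinking about it.
    if latency_high and db_busy and not errors_high:
        return "scale_out"

    # Something is off but it doesn't match a known pattern:
    # monitoring only buys you an earlier call to whoever built it.
    if latency_high or db_busy or errors_high:
        return "page_owner"

    return "ok"
```

Once the app team adds a signal for a newly understood failure, it would become another explicit branch ahead of the `page_owner` fallback, which is the "turns into the first type of problem" step.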