0xbadcafebee 2 days ago

Oh, I have so, so many on-call stories. The "these other people are making our lives miserable" kind is hard to deal with, but there are paths you can take to get them to work on it. Sometimes it's just not feasible (or it's risky) to get them to take more ownership in the short term. So it's really important to do your own job: establish all the potential failure paths, set up lines of ownership, and make sure your dependencies have their shit together (performance testing, trend analysis, alerts, limits, runbooks, etc.) so that when they do inevitably fail, you can push back.

I have never been at a job where on-call was done as well as it could be, and most were/are pretty bad in general. But I could always get changes made to on-call, so that when shit started rolling downhill, it didn't hit me.