0xbadcafebee 2 days ago

I am 40 yrs old. I get paid a shit-ton of money (just around $200K) to do this stupid tech work job. I work 40 hours a week, I get benefits, flex time, plus I work remote.

If I'm getting paged for a legitimate issue that is related to something I built or maintain, then, yes, I am going to respond on-call. Because it's a fucking privilege to get paid this much money to sit on my ass and type into a screen.

If I'm getting paged repeatedly, or for an issue that isn't my responsibility, then I will get pissed off, and yell and scream until I'm no longer on-call (or they fix the issue, whichever comes first). But I am grateful to be able to have this life. I can spend an hour or two after hours to fix my shit that broke.

majormajor 2 days ago

An on-call rotation without sufficient influence over the roadmap and planning to be able to fix persistent problems so they don't repeatedly cause the same issues over and over and over is toxic. And it's gonna kill the team's overall productivity so it's not good for management either. Congrats, you're paying SWE salaries for an ops team that would traditionally cost you less.

In a more healthy situation an on-call rotation is the price of being able to move quickly, get stuff out the door, and have compensation that reflects that the company isn't paying a whole team of extra people to stare at dashboards 24/7 just for the rare situations that things break after-hours.

Gigs with low-overhead + customers that don't expect 24/7 operations are kinda the real sweet-spot dev compensation + role-wise, but ... pretty rare.

0xbadcafebee 2 days ago

Well, I have two thoughts about that:

1) gigs without 24/7 operations are rare, because there is no good reason for a tech product not to be 24/7. it's not costing extra electricity to keep the lights on overnight, nor more staff. there are a bunch of these gigs (my last gig had no customers for 2+ years) but you shouldn't expect them, because part of the reason we're paid so much money is we're expected to deliver "continuous value". most devs would agree with this, because they all want to be able to deploy continuously, whenever they want. (which is a terrible idea, but it is the status quo.) furthermore, if you're doing your job right (and so is Ops), supporting a 24/7 product should not result in on-call pages, because nothing should be breaking outside regular business hours. if it is breaking outside regular hours, somebody sucks at their job. and Ops' job is pretty simple, so...

2) you do have lots of control over the roadmap, planning, etc. but nobody is going to walk up to you and say "hey we were just thinking of maybe doing this in the roadmap, is that okay with you?" you have to get involved, early, and consistently. you have to show you're not going to rock the boat, but that you will have good suggestions, and can show they will turn into better outcomes. you have to play a little politics, a little product ownership, and also an engineering role, in order to influence what the business decides to do. as you get more senior this gets easier because people will defer to you more, but even an extremely likeable junior can influence the roadmap.

on the off-chance that you're just trapped in engineering hell, with hostile management, a terrible product, and a completely apathetic and terrified staff, quit immediately. this isn't normal and you shouldn't think "oh, I'm trapped here." people don't stay in abusive relationships because there's no other choice, they stay because they've justified their own abuse.

tbihl 2 days ago

You haven't lived until you've spent a whole weekend at work rushing to fix a production-limiting issue because the boss doesn't know, though you do, about the other division's production-limiting issue which cannot, under any wildly optimistic circumstance, get done in the next two weeks.

Oh, and that weekend is the weekend before Christmas.

0xbadcafebee 2 days ago

Oh, I have so, so many on-call stories. The "these other people are making our lives miserable" one is hard to deal with, but there are paths you can take to get them to work on it. Sometimes it's just not feasible (or is risky) to get them to take more ownership in the short term. So it's really important to do your own job: establish all the potential failure paths, set up lines of ownership, and make sure your dependencies have their shit together (performance testing, trend analysis, alerts, limits, runbooks, etc) so that when they do inevitably fail you can push back.

I have never been at a job where on-call was done as well as it could be, and most were/are pretty bad in general. But I could always get changes made to on-call, so that when shit started rolling down hill, it didn't hit me.