Remix.run Logo
AlotOfReading 2 days ago

In a former job, I wrote a static PWA to do initial provisioning for robots. A tech would load the page, generate a QR code, and put it in front of the camera to program the robot.

When I looked into having this static page hosted on internal infra, it would have also needed minimum two dedicated oncalls, terraform, LB, containerization, security reviews, SLAs, etc.

I gave up after the second planning meeting and put it on my $5 VPS with a letsencrypt cert. That static page is still running today, having outlived not only the production line, but also the entire company.

Aurornis 2 days ago | parent | next [-]

> When I looked into having this static page hosted on internal infra, it would have also needed minimum two dedicated oncalls, terraform, LB, containerization, security reviews, SLAs, etc.

In my experience there are two kinds of infrastructure or platform teams:

1) The friendly team trying to help everyone get things done with reasonable tradeoffs appropriate for the situation

2) The team who thinks their job is to make it as hard as possible for anyone to launch anything unless it satisfies their 50-item checklist of requirements and survives months of planning meetings where they try to flex their knowledge on your team by picking the project apart.

In my career it’s been either one or the other. I know it’s a spectrum and there must be a lot of room in the middle, yet it’s always been one extreme or the other for me.

stackskipton 2 days ago | parent | next [-]

Having work as Ops person at second place, most of time it ends up like that because Ops become dumping ground so they throw up walls in vain attempt to stem the tide. Application throwing error logs? Page out Ops. We made a mistake in our deploy. Oh well, page out Ops so they can help us recover. Security has found a vulnerability in a Java library? Sounds like Ops problem to us.

nucleardog 2 days ago | parent | prev [-]

I've observed the same and, anecdotally, how much of a pain the ops team is is a function of how much responsibility shifts from the dev team to the ops team during the project lifecycle.

Basically, is the ops team there to support the developer/development team, or the product?

In the case of the development team, the ops team will tend to be willing to provide advice and suggestions but be flexible on tooling and implementation. In the case of the product, the ops team will tend to be a lot more rigid and inflexible.

This plays out in things like:

When the PWA becomes critical for the production line and then "is not working" at 3AM, who is getting paged? If it's the developer, then ops is "supporting the developer". If it's the ops team getting called to debug and fix some project they've never laid eyes on before at 3AM, then it's the product. They are, naturally, going to start caring a lot more about how it is set up, deployed, and supported because nobody likes getting woken for work at 3AM.

When some project's dependencies start running past EOL, who is going to update it? If it's the developer, then ops is "supporting the developer". If the ops team isn't empowered to give a deadline and have _someone else_ responsible for keeping the project functioning, then they're supporting the product and by letting it be deployed effectively committed to maintaining it in perpetuity and they're going to start caring a lot more about what sort of languages, frameworks, etc are used and specifically how projects are set up because context switching to one of dozens of different projects at 3AM is hard enough as-is without having to also be trying to learn some new framework du jour.

(And before anyone says "well the updates probably aren't necessary this is just ops being a pain"--think of the case of a project relying on GCP product that's being shutdown or some kubernetes resource that's been changed. In one case inaction will cause the project to fail, in the other ops' action will cause it to fail. See the first point as to who is going to get called about that. Even in the happy case, consistency brings automation and allows the team to support a _class_ of deployments instead of individual products.)

I don't think places exist stably in the middle ground because it's a painful place to be for very long. The responsibility and the control land on separate people, and the person with the responsibility but without the control is generally going to work to wrestle control to reduce misery. In the case where ops acts as if they're supporting the developers but is in practice supporting the product, it's not going to take too many 3AM calls before they start pushing back on how the product's deployed and supported.

I've been both of those ops guys you describe. When I was the "checklists, meetings, and picking the project apart" guy it had nothing to do with me wanting to make anyone's life difficult or flexing my knowledge. It had to do with the 3AM calls waking myself, my wife, and my newborn up. If I was taking on responsibility for keeping your _product_ functional through its useful life, yeah, I wasn't going to let people dump stuff on my plate unless I had some reasonable basis to believe it wasn't going to substantially increase my workload and result in more middle of the night calls. The checklists were my way of trying to provide consistency and visibility into the process of reducing my own pain, not my way of trying to create pain for others.

beng-nl 2 days ago | parent | prev [-]

I am reminded of “I forgot how to count that low”

https://news.ycombinator.com/item?id=28988281