Remix.run Logo
AznHisoka 14 hours ago

I disagree with that last part. By default, having a few seconds downtime is not complex. The easiest thing you could do to a server is restart it. Its literally just a restart!

valenterry 14 hours ago | parent | next [-]

It's not. Imagine there is a bug that stops the app from starting. It could be anything, from a configuration error (e.g. against the database) to a problem with warmup (if necessary) or any kind of other bug like an exception that only triggers in production for whatever reasons.

EDIT: and worse, it could be something that just started and would even happen when trying to deploy the old version of the code. Imagine a database configuration change that allows the old connections to stay open until they are closed but prevents new connections from being created. In that case, even an automatic roll back to the previous code version would not resolve the downtime. This is not theory, I had those cases quite a few times in my career.

globular-toast 11 hours ago | parent | prev [-]

I managed a few production services like this and it added a lot of overhead to my work. On the one hand I'd get developers asking me why their stuff hasn't been deployed yet. But then I'd also have to think carefully about when to deploy and actually watch it to make sure it came back up again. I would often miss deployment windows because I was doing something else (my real job).

I'm sure there are many solutions but K8s gives us both fully declarative infrastructure configs and zero downtime deployment out of the box (well, assuming you set appropriate readiness probes etc)

So now I (a developer) don't have to worry about server restarts or anything for normal day to day work. We don't have a dedicated DevOps/platforms/SRE team or whatnot. Now if something needs attention, whatever it is, I put my k8s hat on and look at it. Previously it was like "hmm... how does this service deployment work again..?"