zug_zug a day ago

For what it's worth, I've worked at multiple places that ran shell scripts just fine for their deploys.

- One had only 2 services [php] and ran over 1 billion requests a day. Deploy was trivial, ssh some new files to the server and run a migration, 0 downtime.

- One was in an industry that didn't need "Webscale" (retirement accounts). Prod deploys were just docker commands run by jenkins. We ran two servers per service from the day I joined to the day I left 4 years later (3x growth), and ultimately removed one service and one database during all that growth.

Another outstanding thing about both of these places was that we had all the testing environments you need, on-demand, in minutes.

The place I'm at now is trying to do kubernetes and is failing miserably (ongoing nightmare 4 months in and probably at least 8 to go, when it was allegedly supposed to only take 3 total). It has one shared test environment where it takes 3 hours to see your changes.

I don't fault kubernetes directly, I fault the overall complexity. But at the end of the day kubernetes feels like complexity trying to abstract over complexity, and often I find that's less successful than removing complexity in the first place.

YZF 18 hours ago | parent | next [-]

If your application doesn't need and likely won't need to scale to large clusters, or multiple clusters, then there's nothing wrong per se with your solution. I don't think k8s is that hard but there are a lot of moving pieces and there's a bit to learn. Finding someone with experience to help you can make a ton of difference.

Questions worth asking:

- Do you need a load balancer?

- TLS certs and rotation?

- Horizontal scalability.

- HA/DR

- dev/stage/production + being able to test/stage your complete stack on demand.

- CI/CD integrations, tools like ArgoCD or Spinnaker

- Monitoring and/or alerting with Prometheus and Grafana

- Would you benefit from being able to deploy a lot of off the shelf software (let's say Elasticsearch, or some random database, or a monitoring stack) via helm quickly/easily?

- "Ingress"/proxy.

- DNS integrations.

If you answer yes to many of those questions there's really no better alternative than k8s. If you're building large enough scale web applications the answer to most of these will end up being yes at some point.

xorcist 16 hours ago | parent | next [-]

Every item on that list is "boring" tech. Approximately everyone has used load balancers, test environments and monitoring since the 90s just fine. What is it that you think makes Kubernetes especially suited for this compared to every other solution during the past three decades?

There are good reasons to use Kubernetes, mainly if you are using public clouds and want to avoid lock-in. I may be partial, since managing it pays my bills. But it is complex, mostly unnecessarily so, and no one should be able to say with a straight face that it achieves better uptime or requires less personnel than any alternative. That's just sales talk, and should be a big warning sign.

YZF 15 hours ago | parent | next [-]

It's the way things work together. If you want to add a new service you just annotate that service and DNS gets updated, your ingress gets the route added, cert-manager gets you the certs from Let's Encrypt. If you want Prometheus to monitor your pod you just add the right annotation. When your server goes down k8s will move your pod around. k8s storage will take care of having the storage follow your pod. Your entire configuration is highly available and replicated in etcd.
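To make the annotation-driven flow a bit more concrete, here is a rough sketch in Python that builds the kind of manifests being described. The annotation keys are the ones conventionally read by external-dns, cert-manager and the usual Prometheus scrape configuration; they only do something if those components are installed and configured to look for them. The service name, hostname, port and issuer name are made up for illustration.

```python
import json


def annotated_service(name: str, hostname: str, port: int) -> dict:
    """Service manifest carrying DNS and metrics-scrape annotations."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {
            "name": name,
            "annotations": {
                # Read by external-dns, if it is deployed in the cluster.
                "external-dns.alpha.kubernetes.io/hostname": hostname,
                # A convention used by common Prometheus scrape configs,
                # not something Kubernetes itself interprets.
                "prometheus.io/scrape": "true",
                "prometheus.io/port": str(port),
            },
        },
        "spec": {
            "selector": {"app": name},
            "ports": [{"port": port, "targetPort": port}],
        },
    }


def annotated_ingress(name: str, hostname: str, port: int, issuer: str) -> dict:
    """Ingress manifest asking cert-manager for a TLS certificate."""
    backend = {"service": {"name": name, "port": {"number": port}}}
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": name,
            # Read by cert-manager; the issuer must already exist.
            "annotations": {"cert-manager.io/cluster-issuer": issuer},
        },
        "spec": {
            "tls": [{"hosts": [hostname], "secretName": f"{name}-tls"}],
            "rules": [{
                "host": hostname,
                "http": {"paths": [
                    {"path": "/", "pathType": "Prefix", "backend": backend}
                ]},
            }],
        },
    }


if __name__ == "__main__":
    # Hypothetical values; kubectl accepts JSON as well as YAML.
    print(json.dumps(annotated_service("billing", "billing.example.com", 8080), indent=2))
    print(json.dumps(annotated_ingress("billing", "billing.example.com", 8080, "letsencrypt-prod"), indent=2))
```

The point of the sketch is only that the wiring lives in metadata next to the workload; the controllers that act on it still have to be run and maintained by someone.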

It's just very different than your legacy "standard" technology.

gr3ml1n 13 hours ago | parent [-]

None of this is difficult to do or automate, and we've done it for years. Kubernetes simply makes it more complex by adding additional abstractions in the pursuit of pretending hardware doesn't exist.

There are, maybe, a dozen companies in the world with a large enough physical footprint where Kubernetes might make sense. Everyone else is either engaged in resume-driven development, or has gone down some profoundly wrong path with their application architecture to where it is somehow the lesser evil.

sampullman 12 hours ago | parent [-]

I used to feel the same way, but have come around. I think it's great for small companies for a few reasons. I can spin up effectively identical dev/ci/stg/prod clusters for a new project in an hour for a medium sized project, with CD in addition to everything GP mentioned.

I basically don't have to think about ops anymore until something exotic comes up, it's nice. I agree that it feels clunky, and it was annoying to learn, but once you have something working it's a huge time saver. The ability to scale without drastically changing the system is a bonus.

gr3ml1n 10 hours ago | parent | next [-]

> I can spin up effectively identical dev/ci/stg/prod clusters for a new project in an hour for a medium sized project, with CD in addition to everything GP mentioned.

I can do the same thing with `make local` invoking a few bash commands. If the complexity increases beyond that, a mistake has been made.

xorcist 7 hours ago | parent | prev [-]

You could say the same thing about Ansible or Vagrant or Nomad or Salt or anything else.

I can say with complete confidence however, that if you are running Kubernetes and not thinking about ops, you are simply not operating it yourself. You are paying someone else to think about it for you. Which is fine, but says nothing about the technology.

lmm 10 hours ago | parent | prev | next [-]

> Every item on that list is "boring" tech. Approximately everyone has used load balancers, test environments and monitoring since the 90s just fine. What is it that you think makes Kubernetes especially suited for this compared to every other solution during the past three decades?

You could make the same argument against using cloud at all, or against using CI. The point of Kubernetes isn't to make those things possible, it's to make them easy and consistent.

drw85 3 hours ago | parent | next [-]

But none of those things are easy. All cloud environments are fairly complex and kubernetes is not something that you just do in an afternoon. You need to learn about how it works, which takes about the same time as using 'simpler' means to do things directly.

Sure, it means that two people that already understand k8s can easily exchange or handover a project, which might be harder to understand if done with other means. But that's about the only bonus it brings in most situations.

eadmund an hour ago | parent | prev [-]

> The point of Kubernetes isn't to make those things possible, it's to make them easy and consistent.

Kubernetes definitely makes things consistent, but I do not think that it makes them easy.

There’s certainly a lot to learn from Kubernetes, but I strongly believe that a more tasteful successor is possible, and I hope that it is inevitable.

threeseed 11 hours ago | parent | prev | next [-]

Kubernetes is boring tech as well.

And the advantage of it is one way to manage resources, scaling, logging, observability, hardware etc.

All of which is stored in Git and so audited, reviewed, versioned, tested etc in exactly the same way.

andreasmetsala 4 hours ago | parent | prev [-]

> But it is complex, mostly unnecessarily so

Unnecessary complexity sounds like something that should be fixed. Can you give an example?

otabdeveloper4 11 hours ago | parent | prev | next [-]

Kubernetes is a great example of the "second-system effect".

Kubernetes only works if you have a webapp written in a slow interpreted language. For anything else it is a huge impedance mismatch with what you're actually trying to do.

P.S. In the real world, Kubernetes isn't used to solve technical problems. It's used as a buffer between the dev team and the ops team, who usually have different schedules/budgets, and might even be different corporate entities. I'm sure there might be an easier way to solve that problem without dragging in Google's ridiculous and broken tech stack.

mrweasel 8 hours ago | parent | next [-]

> It's used as a buffer between the dev team and the ops team, who usually have different schedules/budgets

That depends on your definition. If the ops team is solely responsible for running the Kubernetes cluster, then yes. In reality that's rarely how things turn out. Developers want Kubernetes, because.... I don't know. Ops doesn't even want Kubernetes in many cases. Kubernetes is amazing, for those few organisations that really need it.

My rule of thumb is: If your worker nodes aren't entire physical hosts, then you might not need Kubernetes. I've seen some absolutely crazy setups where developers had designed this entire solution around Kubernetes, only to run one or two containers. The reasoning is pretty much always the same, they know absolutely nothing about operations, and fail to understand that load balancers exist outside of Kubernetes, or that their solution could be an nginx configuration, 100 lines of Python and some systemd configuration.

I accept that I lost the fight that Kubernetes is overly complex and a nightmare to debug. In my current position I can even see some advantages to Kubernetes, so I was at least a little off in my criticism. Still I don't think Kubernetes should be your default deployment platform, unless you have very specific needs.

rixed 9 hours ago | parent | prev | next [-]

Contrary to popular belief, k8s is not Google's tech stack.

My understanding is that it was initially sold as Google's tech to benefit from Google's tech reputation (exploiting the confusion caused by the fact that some of the original k8s devs were ex-Googlers), and today it's also Google trying to pose as k8s inventor, to benefit from its popularity. Interesting case of host/parasite symbiosis, it seems.

Just my impression though, I can be wrong, please comment if you know more about the history of k8s.

jonasdegendt 7 hours ago | parent [-]

Is there anyone that works at Google that can confirm this?

What's left of Borg at Google? Did the company switch to the open source Kubernetes distribution at any point? I'd love to know more about this as well.

> exploiting the confusion caused by the fact that some of the original k8s devs were ex-Googlers

What about the fact that many active Kubernetes developers are also active Googlers?

fragmede 6 hours ago | parent [-]

Borg isn't going anywhere, Kubernetes isn't Google-scale

maxdo 10 hours ago | parent | prev [-]

kubernetes is an API for your cluster, that is portable between providers, more or less. there are other abstractions, but they are not portable, e.g. fly.io, DO etc. so unless you want a vendor lock-in, you need it. for one of my products, I had to migrate due to business reasons 4 times into different kube flavors, from self-managed (2 times) to GKE and EKS.

otabdeveloper4 10 hours ago | parent [-]

> there are other abstractions, but they are not portable

Not true. Unix itself is an API for your cluster too, like the original post implies.

Personally, as a "tech lead" I use NixOS. (Yes, I am that guy.)

The point is, k8s is a shitty API because it's built only for Google's "run a huge webapp built on shitty Python scripts" use case.

Most people don't need this, what they actually want is some way for dev to pass the buck to ops in some way that PM's can track on a Gantt chart.

signal11 15 hours ago | parent | prev | next [-]

> If you answer yes to many of those questions there's really no better alternative than k8s.

This is not even close to true with even a small number of resources. The notion that k8s somehow is the only choice is right along the lines of “Java Enterprise Edition is the only choice” — ie a real failure of the imagination.

For startups and teams with limited resources, DO, fly.io and render are doing lots of interesting work. But what if you can’t use them? Is k8s your only choice?

Let’s say you’re a large org with good engineering leadership, and you have high-revenue systems where downtime isn’t okay. Also for compliance reasons public cloud isn’t okay.

DNS in a tightly controlled large enterprise internal network can be handled with relatively simple microservices. Your org will likely have something already though.

Dev/Stage/Production: if you can spin up instances on demand this is trivial. Also financial services and other regulated biz have been doing this for eons before k8s.

Load Balancers: lots of non-k8s options exist (software and hardware appliances).

Prometheus / Grafana (and things like Netdata) work very well even without k8s.

Load Balancing and Ingress is definitely the most interesting piece of the puzzle. Some choose nginx or Envoy, but there’s also teams that use their own ingress solution (sometimes open-sourced!)

But why would a team do this? Or more appropriately, why would their management spend on this? Answer: many don’t! But for those that do — the driver is usually cost*, availability and accountability, along with engineering capability as a secondary driver.

(*cost because it’s easy to set up a mixed ability team with experienced, mid-career and new engineers for this. You don’t need a team full of kernel hackers.)

It costs less than you think, it creates real accountability throughout the stack and most importantly you’ve now got a team of engineers who can rise to any reasonable challenge, and who can be cross pollinated throughout the org. In brief the goal is to have engineers not “k8s implementers” or “OpenShift implementers” or “Cloud Foundry implementers”.

lmm 10 hours ago | parent [-]

> DNS in a tightly controlled large enterprise internal network can be handled with relatively simple microservices. Your org will likely have something already though.

And it will likely be buggy with all sorts of edge cases.

> Dev/Stage/Production: if you can spin up instances on demand this is trivial. Also financial services and other regulated biz have been doing this for eons before k8s.

In my experience financial services have been notably not doing it.

> Load Balancers: lots of non-k8s options exist (software and hardware appliances).

The problem isn't running a load balancer with a given configuration at a given point in time. It's how you manage the required changes to load balancers and configuration as time goes on. It's very common for that to be a pile of perl scripts that add up to an ad-hoc informally specified bug-ridden implementation of half of kubernetes.

signal11 9 hours ago | parent [-]

> And it will likely be buggy with all sorts of edge cases.

I have seen this view in corporate IT teams who’re happy to be “implementers” rather than engineers.

In real life, many orgs will in fact have third party vendor products for internal DNS and cert authorities. Writing bridge APIs to these isn’t difficult and it keeps the IT guys happy.

A relatively few orgs have written their own APIs, typically to manage a delegated zone. Again, you can say these must be buggy, but here’s the thing — everything’s buggy. Including k8s. As long as bugs are understood and fixed, no one cares. The proof of the pudding is how well it works.

Internal DNS in particular is easy enough to control and test if you have engineers (vs implementers) in your team.

> manage changes to load balancers … perl

That’s a very black and white view, that teams are either on k8s (which to you is the bees knees) or a pile of Perl (presumably unmaintainable). Speaks to interesting unconscious bias.

Perhaps it comes from personal experience, in which case I’m sorry you had to be part of such a team. But it’s not particularly difficult to follow modern best practices and operate your own stack.

But if your starter stance is that “k8s is the only way”, no one can talk you out of your own mental hard lines.

lmm 8 hours ago | parent [-]

> Again, you can say these must be buggy, but here’s the thing — everything’s buggy. Including k8s. As long as bugs are understood and fixed, no one cares.

Agreed, but internal products are generally buggier, because an internal product is in a kind of monopoly position. You generally want to be using a product that is subject to competition, that is a profit center rather than a cost center for the people who are making it.

> Internal DNS in particular is easy enough to control and test if you have engineers (vs implementers) in your team.

Your team probably aren't DNS experts, and why should they be? You're not a DNS company. If you could make a better DNS - or a better DNS-deployment integration - than the pros, you'd be selling it. (The exception is if you really are a DNS company, either because you actually do sell it, or because you have some deep integration with DNS that enables your competitive advantage)

> Perhaps it comes from personal experience, in which case I’m sorry you had to be part of such a team. But it’s not particularly difficult to follow modern best practices and operate your own stack.

I'd say that's a contradiction in terms, because modern best practice is to not run your own stack.

I don't particularly like kubernetes qua kubernetes (indeed I'd generally pick nomad instead). But I absolutely do think you need a declarative, single-source-of-truth way of managing your full deployment, end-to-end. And if your deployment is made up of a standard load balancer (or an equivalent of one), a standard DNS, and prometheus or grafana, then you've either got one of these products or you've got an internal product that does the same thing, which is something I'm extremely skeptical of for the same reason as above - if your company was capable of creating a better solution to this standard problem, why wouldn't you be selling it? (And if an engineer was capable of creating a better solution to this standard problem, why would they work for you rather than one of the big cloud corps?)

In the same way I'm very skeptical of any company with an "internal cloud" - in my experience such a thing is usually a significantly worse implementation of AWS, and, yes, is usually held together with some flaky Perl scripts. Or an internal load balancer. It's generally NIH, or at best a cost-cutting exercise which tends to show; a company might have an internal cloud that's cheaper than AWS (I've worked for one), but you'll notice the cheapness.

Now again, if you really are gaining a competitive advantage from your things then it may make sense to not use a standard solution. But in that case you'll have something deeply integrated, i.e. monolithic, and that's precisely the case where you're not deploying separate standard DNS, separate standard load balancers, separate standard monitoring etc.. And in that case, as grandparent said, not using k8s makes total sense.

But if you're just deploying a standard Rails (or what have you) app with a standard database, load balancer, DNS, monitoring setup? Then 95% of the time your company can't solve that problem better than the companies that are dedicated to solving that problem. Either you don't have a solution at all (beyond doing it manually), you use k8s or similar, or you NIH it. Writing custom code to solve custom problems can be smart, but writing custom code to solve standard problems usually isn't.

fragmede 6 hours ago | parent [-]

> if your company was capable of creating a better solution to this standard problem, why wouldn't you be selling it?

Let's pretend I'm the greatest DevOps software developer engineer ever, and I write a Kubernetes replacement that's 100x better. Since it's 100x better, I simply charge 100x as much as it costs per CPU/RAM for a Kubernetes license to a 1,000 customers, and take all of that money to the bank and I deposit my check for $0.

I don't disagree with the rest of the comment, but the market for the software to host a web app is a weird market.

zug_zug 16 hours ago | parent | prev | next [-]

> If you answer yes to many of those questions there's really no better alternative than k8s.

Nah, most of that list is basically free for any company that uses an amazon loadbalancer and an autoscale group. In terms of likelihood of incidents, time, and cost, those will each be an order of magnitude higher with a team of kubernetes engineers than with a less complex setup.

psychoslave 7 hours ago | parent | prev [-]

Oz Nova nailed it nicely in "You Are Not Google"

https://blog.bradfieldcs.com/you-are-not-google-84912cf44afb

a2tech 17 hours ago | parent | prev | next [-]

People really underestimate the power of shell scripts and ssh and trusted developers.

stevefan1999 12 hours ago | parent [-]

Besides the fact that shell scripts aren't scalable (in terms of horizontal scalability like the actor model), I would also like to point out that shell scripts should be simple. If you want to handle something that big, you are essentially using shell as a programming language in disguise -- not ideal, and I would rather go with Go or Rust instead.

llm_trw 11 hours ago | parent | next [-]

We don't live in 1999 any more. A big machine with a database can serve everyone in the US and I can fit it in my closet.

It's like people are stuck in the early 2000s when they start thinking about computer capabilities. Today I have more flops in a single GPU under my desk than the world's largest supercomputer did in 2004.

59nadir 11 hours ago | parent [-]

> It's like people are stuck in the early 2000s when they start thinking about computer capabilities.

This makes sense, because the code people write makes machines feel like they're from the early 2000's.

This is partially a joke, of course, but I think there is a massive chasm between the people who think you immediately need several computers to do things for anything other than redundancy, and the people who see how ridiculously much you can do with one.

Nextgrid 2 hours ago | parent | next [-]

> It's like people are stuck in the early 2000s when they start thinking about computer capabilities.

Partly because the "cloud" makes all its money renting you 2010s-era hardware at inflated prices, and people are either too naive or their career is so invested in it that they can't admit to being ripped off and complicit in the scam.

llm_trw an hour ago | parent [-]

That's what gets me about AWS.

When it came out in 2006 the m1.small was about what you'd get on a mid range desktop at that point. It cost $876 a year [0]. Today for an 8 core machine with 32 gb ram you'll pay $3145.19 [1].

It used to take 12-24 months for you to pay enough AWS bills that it would make sense to buy the hardware outright. Now it's 3 months or less for every category and people still defend this. For ML work stations it's weeks.
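The break-even claim is simple arithmetic, sketched below in Python. The two yearly rental figures are the ones quoted above; the hardware purchase prices are placeholders I made up, not quotes, and the calculation ignores power, hosting, bandwidth and admin time, so treat it as a lower bound on how long owning takes to pay off.

```python
def break_even_months(hardware_price: float, yearly_rental: float) -> float:
    """Months of rental payments that add up to the purchase price."""
    if yearly_rental <= 0:
        raise ValueError("yearly_rental must be positive")
    monthly_rental = yearly_rental / 12
    return hardware_price / monthly_rental


if __name__ == "__main__":
    # Rental figures from the comment; purchase prices are hypothetical.
    scenarios = [
        ("2006 m1.small vs. mid-range desktop", 1200.0, 876.0),
        ("today, 8 cores / 32 GB", 750.0, 3145.19),
    ]
    for label, hardware_price, yearly_rental in scenarios:
        months = break_even_months(hardware_price, yearly_rental)
        print(f"{label}: about {months:.1f} months")
```

Plug in a real quote for the hardware and the conclusion either holds or it doesn't; the only inputs that matter are the purchase price and the monthly bill.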

[0] https://aws.amazon.com/blogs/aws/dropping-prices-again-ec2-r...

[1] https://instances.vantage.sh/aws/ec2/m8g.2xlarge?region=us-e...

Aeolun 7 hours ago | parent | prev | next [-]

I added performance testing to all our endpoints from the start, so that people don’t start to normalize those 10s response times that our last system had (cry)

AtlasBarfed 3 hours ago | parent | prev [-]

Well that's what happens when you move away from compiled languages to interpreted.

dgfitz 12 hours ago | parent | prev [-]

> Besides the fact that shell scripts aren't scalable…

What are you trying to say there? My understanding is that, way under the hood, a set of shell scripts is in fact enabling the scalable nature of… the internet.

ChoHag 11 hours ago | parent | next [-]

[dead]

stevefan1999 12 hours ago | parent | prev | next [-]

...that's only for early internet, and the early internet is effing broken at best

lmm 10 hours ago | parent | prev [-]

> My understanding is that, way under the hood, a set of shell scripts is in fact enabling the scalable nature of… the internet.

I sure hope not. The state of error handling in shell scripts alone is enough to disqualify them for serious production systems.

If you're extremely smart and disciplined it's theoretically possible to write a shell script that handles error states correctly. But there are better things to spend your discipline budget on.
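For comparison, here is a minimal sketch of what explicit error handling looks like when the deploy steps are driven from a general-purpose language instead. It is an illustration, not anyone's actual deploy tooling: `run_steps` is a made-up helper, and the commands in the usage section are placeholders that just invoke the Python interpreter.

```python
import subprocess
import sys


def run_steps(steps):
    """Run (name, argv) steps in order and stop at the first failure.

    Returns the names of the steps that completed. Raises RuntimeError
    naming the failing step, so later steps never run on a broken state.
    """
    done = []
    for name, argv in steps:
        try:
            subprocess.run(
                argv, check=True, capture_output=True, text=True, timeout=60
            )
        except subprocess.CalledProcessError as exc:
            raise RuntimeError(
                f"step {name!r} exited with {exc.returncode}: {exc.stderr.strip()}"
            ) from exc
        except (OSError, subprocess.TimeoutExpired) as exc:
            raise RuntimeError(f"step {name!r} could not complete: {exc}") from exc
        done.append(name)
    return done


if __name__ == "__main__":
    # Placeholder commands standing in for "copy files" and "run migration".
    steps = [
        ("copy-files", [sys.executable, "-c", "print('copied')"]),
        ("migrate", [sys.executable, "-c", "print('migrated')"]),
    ]
    print(run_steps(steps))
```

Every failure mode here (non-zero exit, missing binary, hang) is handled in one place, which is the part that takes real discipline to get right in shell.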

dgfitz an hour ago | parent [-]

My half tongue-in-cheek comment was implying things like "you can't boot a linux/bsd box without shell scripts" which would make the whole "serving a website" bit hard.

I realize that there exist OSes that are an exception to this rule. I didn't understand the comment about scripts scaling. It's a script, it can do whatever you want.

starttoaster 14 hours ago | parent | prev | next [-]

On the other hand, my team slapped 3 servers down in a datacenter, had each of them configured in a Proxmox cluster within a few hours. Some 8-10 hours later we had a fully configured kubernetes cluster running within Proxmox VMs, where the VMs and k8s cluster are created and configured using an automation workflow that we have running in GitHub Actions. An hour or two worth of work later we had several deployments running on it and serving requests.

Kubernetes is not simple. In fact it's even more complex than just running an executable with your linux distro's init system. The difference in my mind is that it's more complex for the system maintainer, but less complex for the person deploying workloads to it.

And that's before exploring all the benefits of kubernetes-ecosystem tooling like the Prometheus operator for k8s, or the horizontally scalable Loki deployments, for centrally collecting infrastructure and application metrics, and logs. In my mind, making the most of these kinds of tools, things start to look a bit easier even for the systems maintainers.

Not trying to discount your workplace too much. But I'd wager there's a few people that are maybe not owning up to the fact that it's their first time messing around with kubernetes.

regularfry 7 hours ago | parent [-]

As long as your organisation can cleanly either a) split the responsibility for the platform from the responsibility for the apps that run on it, and fund it properly, or b) do the exact opposite and accommodate all the responsibility for the platform into the app team, I can see it working.

The problems start when you're somewhere between those two points. If you've got a "throw it over the wall to ops" type organisation, it's going to go bad. If you've got an underfunded platform team so the app team has to pick up some of the slack, it's going to go bad. If the app team have to ask permission from the platform team before doing anything interesting, it's going to go bad.

The problem is that a lot of organisations will look at k8s and think it means something it doesn't. If you weren't willing to fund a platform team before k8s, I'd be sceptical that moving to it is going to end well.

loftsy a day ago | parent | prev | next [-]

Are you self hosting kubernetes or running it managed?

I've only used it managed. There is a bit of a learning curve but it's not so bad. I can't see how it can take 4 months to figure it out.

zug_zug a day ago | parent | next [-]

We are using EKS

> I can't see how it can take 4 months to figure it out.

Well have you ever tried moving a company with a dozen services onto kubernetes piece-by-piece, with zero downtime? How long would it take you to correctly move and test every permission, environment variable, and issue you run into?

Then if you get a single setting wrong (e.g. memory size) and don't load-test with realistic traffic, you bring down production, potentially lose customers, and have to do a public post-mortem about your mistakes? [true story for current employer]

I don't see how anybody says they'd move a large company to kubernetes in such an environment in a few months with no screwups and solid testing.

sethammons a day ago | parent | next [-]

Took us three to four years to go from self-hosted multi-DC to getting the main product almost fully in k8s (some parts didn't make sense in k8s and were pushed to our geo-distributed edge nodes). Dozens of services and teams, and keeping the old stuff working while changing the tire on the car while driving. All while the company continues to grow and scale doubles every year or so. It takes maturity in testing and monitoring, and it takes longer than everyone estimates.

Cpoll 20 hours ago | parent | prev | next [-]

It sounds like it's not easy to figure out the permissions, envvars, memory size, etc. of your existing system, and that's why the migration is so difficult? That's not really one of Kubernetes' (many) failings.

Vegenoid 19 hours ago | parent | next [-]

Yes, and now we are back at the ancestor comment’s original point: “at the end of the day kubernetes feels like complexity trying to abstract over complexity, and often I find that's less successful than removing complexity in the first place”

Which I understand to mean “some people think using Kubernetes will make managing a system easier, but it often will not do that”

Pedro_Ribeiro 19 hours ago | parent | prev [-]

Can you elaborate on other things you think Kubernetes gets wrong? Asking out of curiosity because I haven't delved deep into it.

tail_exchange a day ago | parent | prev | next [-]

It largely depends how customized each microservice is, and how many people are working on this project.

I've seen migrations of thousands of microservices happening within the span of two years. Longer timeline, yes, but the number of microservices is orders of magnitude larger.

Though I suppose the organization works differently at this level. The Kubernetes team built a tool to migrate the microservices, and each owner was asked to perform the migration themselves. Small microservices could be migrated in less than three days, while the large and risk-critical ones took a couple of weeks. This all happened in less than two years, but it took more than that in terms of engineer-weeks.

The project was very successful though. The company spends way less money now because of the autoscaling features, and the ability to run multiple microservices in the same node.

Regardless, if the company is running 12 microservices and this number is expected to grow, this is probably a good time to migrate. How did they account for the different shape of services (stateful, stateless, leader elected, cron, etc), networking settings, styles of deployment (blue-green, rolling updates, etc), secret management, load testing, bug bashing, gradual rollouts, dockerizing the containers, etc? If it's taking 4x longer than originally anticipated, it seems like there was a massive failure in project design.

hedora a day ago | parent [-]

2000 products sounds like you made 2000 engineers learn kubernetes (a week, optimistically, 2000/52 = 38 engineer years, or roughly one wasted career).
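Spelled out in Python, the back-of-envelope estimate looks like this. The one-week learning figure is the "optimistic" guess from this comment; the migration line takes the parent's "less than three days" for small services as if it applied to all 2000 and assumes 5 working days per week, both simplifications of mine.

```python
WEEKS_PER_YEAR = 52
WORKDAYS_PER_WEEK = 5  # assumption, not stated in the thread


def engineer_years(people: int, weeks_each: float) -> float:
    """Total effort in engineer-years if each person spends weeks_each."""
    return people * weeks_each / WEEKS_PER_YEAR


if __name__ == "__main__":
    learning = engineer_years(2000, 1)
    # Treat every service as a "small" three-day migration (a simplification).
    migrating = engineer_years(2000, 3 / WORKDAYS_PER_WEEK)
    print(f"learning:  {learning:.1f} engineer-years")
    print(f"migrating: {migrating:.1f} engineer-years")
```

Whether that is a lot depends entirely on what it is compared against, which is where the replies below disagree.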

Similarly, the actual migration times you estimate add up to decades of engineer time.

It’s possible kubernetes saves more time than using the alternative costs, but that definitely wasn’t the case at my previous two jobs. The jury is out at the current job.

I see the opportunity cost of this stuff every day at work, and am patiently waiting for a replacement.

tail_exchange a day ago | parent | next [-]

> 2000 products sounds like you made 2000 engineers learn kubernetes (a week, optimistically, 2000/52 = 38 engineer years, or roughly one wasted career).

Not really, they only had to use the tool to run the migration and then validate that it worked properly. As the other commenter said, a very basic setup for kubernetes is not that hard; the difficult set up is left to the devops team, while the service owners just need to see the basics.

But sure, we can estimate it at 38 engineering years. That's still 38 years for 2,000 microservices; it's way better than 1 year for 12 microservices like in OP's case. The savings we got were enough to offset these 38 years of work, so this project is now paying dividends.

mschuster91 a day ago | parent | prev [-]

> 2000 products sounds like you made 2000 engineers learn kubernetes (a week, optimistically, 2000/52 = 38 engineer years, or roughly one wasted career).

Learning k8s enough to be able to work with it isn't that hard. Have a centralized team write up a decent template for a CI/CD pipeline, a Dockerfile for the most common stacks you use, and a Helm chart with an example for a Deployment, PersistentVolumeClaim, Service and Ingress. Distribute that, and be available for support should the need for Kubernetes go beyond "we need 1-N pods for this service, they get some environment variables from which they are configured, and maybe a Secret/ConfigMap if the application would rather have its configuration in files". That is enough in my experience.
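As a rough illustration of how small the "1-N pods plus some environment variables" case is, here is a minimal sketch that builds an `apps/v1` Deployment manifest as a plain dict and prints it as JSON (which `kubectl apply -f` accepts). This is not the commenter's actual template: the `deployment` helper, the service name, the image reference and the environment variables are all hypothetical.

```python
import json


def deployment(name: str, image: str, replicas: int, env: dict) -> dict:
    """Build a minimal apps/v1 Deployment manifest as a plain dict."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # Sorted so the rendered output is stable across runs.
                            "env": [
                                {"name": key, "value": value}
                                for key, value in sorted(env.items())
                            ],
                        }
                    ]
                },
            },
        },
    }


# Hypothetical service; every value below is a placeholder.
manifest = deployment(
    name="billing-api",
    image="registry.example.com/billing-api:1.4.2",
    replicas=2,
    env={"LOG_LEVEL": "info", "DB_HOST": "db.internal"},
)
print(json.dumps(manifest, indent=2))
```

A real shared template would more likely be a Helm chart with these values exposed as chart values; the dict form is only meant to show how few fields a service owner has to care about.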

relaxing a day ago | parent [-]

> Learning k8s enough to be able to work with it isn't that hard.

I’ve seen a lot of people learn enough k8s to be dangerous.

Learning it well enough to not get wrapped around the axle with some networking or storage details is quite a bit harder.

mschuster91 a day ago | parent [-]

For sure, but that's the job of a good ops department. Where I work, for example, every project's CI/CD pipeline has its own IAM user mapping to a Kubernetes role that only has explicitly defined capabilities: create, modify and delete just the utter basics. Even if they'd commit something into the Helm chart that could cause an annoyance, the service account wouldn't be able to call the required APIs. And the templates themselves come with security built in: privileges are all explicitly dropped, pod UIDs/GIDs are hardcoded to non-root, and we're deploying Network Policies at least for ingress as well now. Only egress network policies aren't in place yet; we haven't been able to make these work with services.

Anyone wishing to do stuff like use the RDS database provisioner gets an introduction from us on how to use it and what the pitfalls are, and regular reviews of their code. They're flexible but we keep tabs on what they're doing, and when they have done something useful we aren't shy about integrating whatever they have done into our shared template repository.
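To make "only explicitly defined capabilities" concrete, here is a minimal sketch of a namespaced RBAC Role built as a dict, plus a deliberately simplified permission check. The role name, namespace and the exact resource list are hypothetical, and `is_allowed` is only an illustration: the real Kubernetes authorizer also handles wildcards, `resourceNames`, ClusterRoles and bindings.

```python
BASIC_VERBS = ["get", "list", "create", "update", "patch", "delete"]


def ci_role(namespace: str) -> dict:
    """Role that lets a CI pipeline manage only basic workload objects."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": "ci-deployer", "namespace": namespace},
        "rules": [
            {"apiGroups": ["apps"], "resources": ["deployments"], "verbs": BASIC_VERBS},
            # "" is the core API group (Services, ConfigMaps, ...).
            {"apiGroups": [""], "resources": ["services", "configmaps"], "verbs": BASIC_VERBS},
        ],
    }


def is_allowed(role: dict, api_group: str, resource: str, verb: str) -> bool:
    """Simplified check: allowed only if some rule lists group, resource and verb."""
    return any(
        api_group in rule["apiGroups"]
        and resource in rule["resources"]
        and verb in rule["verbs"]
        for rule in role["rules"]
    )


role = ci_role("team-billing")
print(is_allowed(role, "apps", "deployments", "create"))
print(is_allowed(role, "rbac.authorization.k8s.io", "rolebindings", "create"))
```

The first check passes and the second does not, which is the point of the setup described above: a chart that tries to create something outside the list is rejected by the API server no matter what was committed.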

zdragnar a day ago | parent | prev | next [-]

Comparing the simplicity of two PHP servers against a setup with a dozen services is always going to be one-sided. The difference in complexity alone is massive, regardless of whether you use k8s or not.

My current employer did something similar, but with fewer services. The upshot is that with terraform and helm and all the other yaml files defining our cluster, we have test environments on demand, and our uptime is 100x better.

loftsy a day ago | parent | prev | next [-]

Fair enough, that sounds hard.

Memory size is an interesting example. A typical Kubernetes deployment has much more control over this than a typical non-container setup. It costs you some effort to figure out the right setting, but in the long term you are rewarded with a more robust and more re-deployable application.

otabdeveloper4 10 hours ago | parent [-]

> has much more control over this than a typical non-container setup

Actually not true, k8s uses the exact same cgroups API for this under the hood that systemd does.
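A small sketch of what that means in practice: a container memory limit such as `512Mi` ends up as a byte count in the cgroup memory controller (`memory.max` on cgroup v2), and the same number can be set on a systemd unit with `MemoryMax=`. The parser below is a simplified, hypothetical helper that handles only integer quantities with a handful of suffixes, not the full Kubernetes quantity grammar.

```python
BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
DECIMAL_SUFFIXES = {"k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12}


def quantity_to_bytes(quantity: str) -> int:
    """Convert an integer quantity like '512Mi' or '2G' to bytes."""
    # Check two-letter binary suffixes first ('Mi' must not be read as 'M').
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    for suffix, factor in DECIMAL_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain number of bytes


limit = quantity_to_bytes("512Mi")
print(f"cgroup v2 memory.max: {limit}")
print(f"systemd unit setting: MemoryMax={limit}")
```

Both lines carry the value 536870912; the kernel mechanism is the same, and the difference is mostly in who writes the value and how it is declared.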

jrs235 a day ago | parent | prev | next [-]

> I don't see how anybody says they'd move a large company to kubernetes in such an environment in a few months with no screwups and solid testing.

Unfortunately, I do. Somebody says that when the culture of the organization expects to hear what it wants to hear rather than the cold hard truth. And likely the person saying it says so from a perch up high and is not responsible for the day-to-day work of actually implementing the change. I see this happen when the person, usually in management/leadership, lacks the skills and knowledge to perform the work themselves. They've never been in the trenches and had to deal face to face with the devil in the details.

malux85 20 hours ago | parent | prev [-]

Canary deploy, dude (or dude-ette): route 0.001% of service traffic to the new version and then slowly move the rest over. Then set error budgets. Then a bad service won't "bring down production".

That's how we did it at Google (I was part of the core team responsible for ad serving infra - billions of ads to billions of users a day)
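The canary idea above can be sketched as a ramp with an error-budget gate. This is only an illustration under stated assumptions, not how Google's tooling works: the step schedule, the 0.1% budget and the `run_canary` function are all made up for the example, and a real rollout would read error rates from monitoring rather than from a list.

```python
# Percent of traffic sent to the new version at each stage (hypothetical ramp).
STEPS = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]


def run_canary(observations, budget=0.001):
    """Walk the ramp; observations[i] is (errors, requests) seen at STEPS[i].

    Returns (traffic_percent, status). Rolls back to 0% as soon as the
    observed error rate at any stage exceeds the error budget.
    """
    weight = 0.0
    for step, (errors, requests) in zip(STEPS, observations):
        weight = step
        if requests and errors / requests > budget:
            return 0.0, "rolled back"
    status = "promoted" if weight == STEPS[-1] else "in progress"
    return weight, status


healthy = [(0, 1000)] * len(STEPS)
bad_second_stage = [(0, 1000), (5, 1000)]

print(run_canary(healthy))
print(run_canary(bad_second_stage))
```

With the healthy observations the rollout reaches 100%; in the second case the 0.5% error rate at the 0.01% stage blows the budget and traffic goes back to the old version while only a sliver of users ever saw the bad build.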

pclmulqdq a day ago | parent | prev [-]

Using microk8s or k3s on one node works fine. As the author of "one big server," I am now working on an application that needs some GPUs and needs to be able to deploy on customer hardware, so k8s is natural. Our own hosted product runs on 2 servers, but it's ~10 containers (including databases, etc).

jrockway 16 hours ago | parent [-]

Yup, I like this approach a lot. With cloud providers considering VMs durable these days (they get new hardware for your VM if the hardware it's on dies, without dropping any TCP connections), I think a 1 node approach is enough for small things. You can get like 192 vCPUs per node. This is enough for a lot of small companies.

I occasionally try non-k8s approaches to see what I'm missing. I have a small ARM machine that runs Home Assistant and some other stuff. My first instinct was to run k8s (probably kind honestly), but I didn't really want to write a bunch of manifests and let myself scope creep to running ArgoCD. I decided on `podman generate systemd` instead (with nightly re-pulls of the "latest" tag; I live and die by the bleeding edge). This was OK, until I added zwavejs, and now the versions sometimes get out of sync, which I notice by a certain light switch not working anymore. What I should have done instead was have some sort of git repository where I keep the versions of these two things, and update both of them atomically. Oh wow, I really did need ArgoCD and Kubernetes ;)

I get by with podman by angrily ssh-ing in while wearing my winter jacket when I'm trying to leave my house but can't turn the lights off. Maybe this can be blamed on auto-updates, but frankly anything exposed to a network that is out of date is also a risk, so I don't think you can ever really win.
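The "versions in one git repository, updated together" idea from the comment above can be sketched without Kubernetes at all. This is a minimal illustration, not jrockway's setup: the pin table, the image references, the version numbers and the `update_pins` helper are all hypothetical. The point is that the two coupled services are only ever bumped in one change, and never float on "latest".

```python
# Hypothetical pin file contents, as they might be committed to git.
PINS = {
    "home-assistant": "registry.example.com/home-assistant:2024.11.1",
    "zwavejs": "registry.example.com/zwavejs:9.27.0",
}

# Services whose versions must move together.
COUPLED = {"home-assistant", "zwavejs"}


def update_pins(pins: dict, changes: dict) -> dict:
    """Return a new pin table, refusing partial or floating updates."""
    touched = COUPLED & set(changes)
    if touched and touched != COUPLED:
        raise ValueError("coupled services must be updated in the same change")
    for name, image in changes.items():
        if name not in pins:
            raise ValueError(f"unknown service: {name}")
        if image.endswith(":latest") or ":" not in image:
            raise ValueError(f"{name} must be pinned to an explicit tag")
    return {**pins, **changes}


new_pins = update_pins(
    PINS,
    {
        "home-assistant": "registry.example.com/home-assistant:2024.12.0",
        "zwavejs": "registry.example.com/zwavejs:9.28.0",
    },
)
print(new_pins)
```

Committing the resulting table in a single commit and regenerating both systemd units from it gives the atomic update; whether that is less work than ArgoCD is exactly the trade-off being joked about.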

leetrout a day ago | parent | prev | next [-]

Yeah, but that doesn't sound shiny on your resume.

txutxu 7 hours ago | parent | next [-]

I never chose a single thing at my job just because of how it would look on my resume.

After 20+ years of Linux sysadmin/devops work, and because of a spinal disc herniation last year, I'm now looking for a job.

99% of job offers ask for EKS/Kubernetes now.

It's like the VMware of the years 200[1-9], or like the "Cloud" of the years 201[1-9].

I've always specialized in physical datacenters and servers, be it on-premises, colocation, embedded, etc., so I'm out of the market now, at least in Spain (which always runs about 8 years behind the market).

You can try to avoid it, and it's nice when you save your company thousands of operational/performance/security/etc. issues and dollars over the years, and you look like a guru who stays ahead of industry issues in your boss's eyes, but it will make finding a job... 99% harder.

It doesn't matter if you demonstrate the highest level in Linux, scripting, ansible, networking, security, hardware, performance tuning, high availability, all kinds of balancers, switching, routing, firewalls, encryption, backups, monitoring, log management, compliance, architecture, isolation, budget management, team management, provider/customer management, debugging, automation, full-stack programming, and much more. If you say "I never worked with Kubernetes, but I learn fast", with your best sincerity at the interview, then you're automatically out of the process. No matter if you're talking with human resources, an assistant to the CTO, or the CTO. You're out.

nine_k a day ago | parent | prev [-]

Depends on what kind of company you want to join. Some value simplicity and efficiency more.

nouripenny 14 hours ago | parent | prev | next [-]

I think porting to k8s can succeed or fail, like any other project. I switched an app that I alone worked on, from Elastic Beanstalk (with Bash), to Kubernetes (with Babashka/Clojure). It didn't seem bad. I think k8s is basically a well-designed solution. I think of it as a declarative language which is sent to interpreters in k8s's control plane.

Obviously, some parts of it took a while to figure out. For example, I needed to figure out an AWS security group problem with Ingress objects that, as I recall, wasn't well documented. So I think parts of that declarative language can suck if the declarative parts aren't well factored out from the imperative parts, or if the log messages don't help you diagnose errors, or if there isn't some kind of (dynamic?) linter that helps you notice problems quickly.

In your team's case, more information seems needed to help us evaluate the problems. Why was it easier before to make testing environments, and harder now?

arkh 8 hours ago | parent | prev [-]

So, my current experience, at a place where most old apps are very old school:

- Most server software is waaaaaaay out of date, so getting a dev/test env is a little harder (the last problem we hit was that the HAProxy version does not do ECDSA keys for SSL certs, which is the default with certbot).

- Yeah, pushing to prod is "easy": FTP directly. But now which version of which files is really in prod? No idea. When I say old school, I mean old school from before things like Jenkins.

- Need something done around the servers? That's the OPS team's job. That team also has too much other work to do, so now you'll have to wait a week or two for a simple "add an upload file" endpoint on an old API, because you need somewhere to put those files.

Now we've started setting up some on-prem k8s nodes for the new developments. Not because we need crazy scaling, but so the dev team can do most of the OPS work they need themselves. It takes time to have everything set up, but once it started chugging along it felt good to be able to just declare whatever we need and get it. You still need to get the devs to learn k8s, which is not fun, but that's the life of a dev: learning new things every day.

Also k8s does not do data. You want a database or anything managing files: you want to do most of the job outside k8s.