zug_zug a day ago
> We are using EKS
> I can't see how it can take 4 months to figure it out.

Well, have you ever tried moving a company with a dozen services onto Kubernetes piece by piece, with zero downtime? How long would it take you to correctly move and test every permission, environment variable, and issue you run into? Then if you get a single setting wrong (e.g. memory size) and don't load-test with realistic traffic, you bring down production, potentially lose customers, and have to do a public post-mortem about your mistakes? [true story for current employer]

I don't see how anybody says they'd move a large company to Kubernetes in such an environment in a few months with no screwups and solid testing.
sethammons a day ago
Took us three to four years to go from self-hosted multi-DC to getting the main product almost fully in k8s (some parts didn't make sense in k8s and were pushed to our geo-distributed edge nodes). Dozens of services and teams, and keeping the old stuff working while changing the tire on the car while driving. All while the company continues to grow and scale doubles every year or so. It takes maturity in testing and monitoring, and it takes longer than everyone estimates.
Cpoll 20 hours ago
It sounds like it's not easy to figure out the permissions, envvars, memory size, etc. of your existing system, and that's why the migration is so difficult? That's not really one of Kubernetes' (many) failings.
tail_exchange a day ago
It largely depends on how customized each microservice is, and how many people are working on the project. I've seen migrations of thousands of microservices happen within the span of two years. Longer timeline, yes, but the number of microservices is orders of magnitude larger.

Though I suppose the organization works differently at that scale. The Kubernetes team built a tool to migrate the microservices, and each owner was asked to perform the migration themselves. Small microservices could be migrated in less than three days, while the large and risk-critical ones took a couple of weeks. The whole thing happened in less than two years of calendar time, though it cost far more than that in engineer-weeks. The project was very successful though: the company spends way less money now because of the autoscaling features and the ability to run multiple microservices on the same node.

Regardless, if the company is running 12 microservices and that number is expected to grow, this is probably a good time to migrate.

How did they account for the different shapes of services (stateful, stateless, leader-elected, cron, etc.), networking settings, styles of deployment (blue-green, rolling updates, etc.), secret management, load testing, bug bashing, gradual rollouts, dockerizing the services, etc.? If it's taking 4x longer than originally anticipated, it seems like there was a massive failure in project design.
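For the rolling-update style at least, most of the per-service work is declarative. A minimal sketch of the kind of manifest each team ends up owning (service name, image, and health endpoint are made up):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: example-service        # hypothetical service
    spec:
      replicas: 4
      selector:
        matchLabels:
          app: example-service
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxSurge: 1              # add one new pod at a time
          maxUnavailable: 0        # never drop below the desired replica count
      template:
        metadata:
          labels:
            app: example-service
        spec:
          containers:
            - name: app
              image: registry.example.com/example-service:1.0.0
              ports:
                - containerPort: 8080
              readinessProbe:      # traffic only shifts once the new pod reports ready
                httpGet:
                  path: /healthz
                  port: 8080

Blue-green and leader-elected workloads need more than this (e.g. two Deployments plus a Service selector flip, or a StatefulSet with leader leases), which is presumably where a lot of the per-service effort went.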
zdragnar a day ago
Comparing the simplicity of two PHP servers against a setup with a dozen services is always going to be one-sided. The difference in complexity alone is massive, regardless of whether you use k8s or not.

My current employer did something similar, but with fewer services. The upshot is that with Terraform and Helm and all the other YAML files defining our cluster, we have test environments on demand, and our uptime is 100x better.
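The "test environments on demand" part mostly falls out of parameterizing the chart per environment. A rough sketch of the kind of values override involved (all names and numbers invented), installed with something like `helm install my-service ./chart -f values-pr-1234.yaml -n test-pr-1234 --create-namespace`:

    # values-pr-1234.yaml -- throwaway environment for one branch/PR (hypothetical)
    environment: pr-1234
    replicaCount: 1                  # much smaller than production
    ingress:
      host: pr-1234.test.example.com
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        memory: 256Mi

Tearing it down is a `helm uninstall` (or just deleting the namespace), which is what makes the on-demand part cheap.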
loftsy a day ago
Fair enough, that sounds hard. Memory size is an interesting example. A typical Kubernetes deployment has much more control over this than a typical non-container setup. It costs you effort to figure out the right setting, but in the long term you are rewarded with a more robust and more re-deployable application.
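For anyone who hasn't seen it, the memory setting in question is a few lines on each container spec; the numbers here are made up, and finding the right ones is exactly the load-testing work described above:

    resources:
      requests:
        memory: 512Mi    # what the scheduler reserves for the pod
      limits:
        memory: 1Gi      # the container is OOM-killed if it goes past this

Set the limit too low for realistic traffic and the kernel kills the process under load, which is how a single wrong setting ends up looking like "bringing down production".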
jrs235 a day ago
> I don't see how anybody says they'd move a large company to kubernetes in such an environment in a few months with no screwups and solid testing.

Unfortunately, I do. Somebody says that when the culture of the organization expects to hear what it wants to hear rather than the cold hard truth. And the person saying it likely says it from a perch up high, not responsible for the day-to-day work of actually implementing the change. I see this happen when that person (management/leadership) lacks the skills and knowledge to perform the work themselves. They've never been in the trenches and had to actually deal face to face with the devil in the details.
malux85 20 hours ago
Canary deploy, dude (or dudette): route 0.001% of service traffic and then slowly move it over. Then set error budgets. Then a bad service won't "bring down production". That's how we did it at Google (I was part of the core team responsible for ad serving infra - billions of ads to billions of users a day).
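In Kubernetes terms, a split that fine needs a weighted routing layer in front of the Services rather than plain replica ratios. A sketch using Gateway API weights, with every name invented (1 / 100000 ≈ 0.001%):

    apiVersion: gateway.networking.k8s.io/v1
    kind: HTTPRoute
    metadata:
      name: ads-canary-route            # hypothetical
    spec:
      parentRefs:
        - name: public-gateway          # hypothetical Gateway
      hostnames:
        - "ads.example.com"
      rules:
        - backendRefs:
            - name: ads-stable          # existing Service
              port: 8080
              weight: 99999
            - name: ads-canary          # new version under test
              port: 8080
              weight: 1                 # ~0.001% of requests

From there you watch the canary's error rate against the budget and ratchet the weight up; a service mesh's weighted routing works the same way.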