voidfunc 13 hours ago

I'm always kind of blown away by experiences like this. Admittedly, I've been using Kubernetes since the early days and I manage an Infra team that operates a couple thousand self-managed Kubernetes clusters so... expert blindness at work. Before that I did everything from golden images to pushing changes via rsync and kicking a script to deploy.

Maybe it's because I adopted early and have grown with the technology that it all just makes sense? It's not that complicated if you limit yourself to the core stuff. Maybe I need to write a book like "Kubernetes for Greybeards" or something like that.

What does fucking kill me in the Kubernetes ecosystem is the amount of add-on crap that is pitched as "necessary". Sidecars... so many sidecars. Please stop. There's way too much vendor garbage surrounding the ecosystem, and devs rarely stop to think about whether they should deploy something when it's as easy as dropping in some YAML and letting the cluster magically run it.

jq-r 10 hours ago | parent | next [-]

Those "necessary" add-ons and sidecars are out of control, but its the people problem. I'm part of the infra team and we manage just couple of k8s clusters, but those are quite big and have very high traffic load. The k8s + terraform code is simple, with no hacks, reliable and easy to upgrade. Our devs love it, we love it too and all of this makes my job pleasant and as little stressful as possible.

But we recently hired a staff engineer to the team (now the most senior) and the guy just cannot sit still. "Oh we need a service mesh because we need visibility! I've been using it at my previous job and it's the best thing ever." Even though we have all the visibility/metrics that we need and never needed more than that. Then it's "we need a different ingress controller, X is crap, Y surely is much better!" etc.

So it's not inexperienced engineers wanting the newest hotness because they have no idea how to solve things with the tools they have; it's sometimes senior engineers trying to justify their salary and "seniority" by buying into complexity as they try to make themselves irreplaceable.

ysofunny 7 hours ago | parent | next [-]

> Then it's "we need a different ingress controller, X is crap, Y surely is much better!" etc.

I regard these as traits of a junior dev. They're thinking technology-first, not problem-first.

jppittma 8 hours ago | parent | prev | next [-]

Service mesh is complicated, but the reason you use it is to integrate services across clusters. That, and it has a bunch of useful reverse proxy features. On the other hand, it took me and two other guys two evenings of blood, sweat, and tears to understand what the fuck a virtual service actually does.

It’s not strictly necessary, but if you’ve had to put in the work elsewhere, I’d use it.

cyberpunk 8 hours ago | parent | prev | next [-]

To be fair, istio and cilium are extremely useful tools to have under your belt.

There’s always a period of “omgwhat” when new senior engineers join and they want to improve things. There’s a short window between joining and getting bogged down in a million projects where this is possible.

Embrace it, I reckon.

p_l 2 hours ago | parent [-]

Doing it well IMO requires not deploying everything as a sidecar but maybe, maybe, deploying it as a shared node service.

In fact, I'm pretty sure I've read a write-up from Alibaba? on huge performance wins from moving Istio out of the sidecar and into a shared node service.
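Istio's ambient mode is roughly that shape today: a per-node ztunnel handles traffic instead of per-pod sidecars. A rough sketch, assuming istioctl is already installed and "my-app" is just a placeholder namespace:

    # Install Istio without sidecar injection; traffic goes through a per-node ztunnel DaemonSet.
    istioctl install --set profile=ambient

    # Opt a namespace into the mesh without restarting or injecting anything into its pods.
    kubectl label namespace my-app istio.io/dataplane-mode=ambient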

cyberpunk an hour ago | parent [-]

Sure, cilium is also much faster than istio. But I guess it depends on your workload. We don't care all that much about performance vs compliance (non-hft finance transactional stuff) and I think we're doing things reasonably well. :}

p_l 12 minutes ago | parent [-]

I didn't mean replacing istio with cilium; I meant running the proxy and routing operations as a shared per-node component instead of per pod.

withinboredom 7 hours ago | parent | prev | next [-]

> So its not inexperienced engineers wanting newest hotness because they have no idea how to solve stuff with the tools they have, its sometimes senior engineers trying to justify their salary, "seniority" by buying into complexity as they try to make themselves irreplaceable.

The grass is always greener where you water it. They joined your company because the grass was greener there than anywhere else they could get an offer at. They want to keep it that way or make it even greener. Assuming that someone is doing something to become 'irreplaceable' is probably not healthy.

zelphirkalt an hour ago | parent | next [-]

They want to make it "greener" for whom? I think that is the question.

monooso 7 hours ago | parent | prev [-]

I really don't understand this comment.

alienchow 8 hours ago | parent | prev | next [-]

How do you scale mTLS ops when the CISO comes knocking?

carlmr 9 hours ago | parent | prev | next [-]

>expert blindness at work.

>It's not that complicated if you limit yourself to the core stuff.

Isn't this the core problem with a lot of technologies? There's a right way to use it, but most ways are wrong. An expert no longer looks left and right, but to anyone entering the technology with fresh eyes it's a field with an abundance of landmines to navigate around.

It's simply bad UX and documentation. It could probably be better. But now it's too late to change everything because you'd annoy all the experts.

>There's way too much vendor garbage surrounding the ecosystem

Azure has been especially bad in this regard. It's poorly documented in all respects, with too many confusing UI menus that have similar or identical names and do different things. If you use Azure Kubernetes, the wrapper makes it much harder to learn the "core essentials". It's better to run minikube and get to know k8s first. Even then a lot of the Azure stuff remains confusing.
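For anyone starting down the minikube route, the minimal loop is roughly this (the deployment and image names are just placeholders):

    # Single-node local cluster.
    minikube start

    # Run something and expose it.
    kubectl create deployment hello --image=nginx
    kubectl expose deployment hello --port=80 --type=NodePort

    # Print a URL reachable from the host.
    minikube service hello --url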

wruza 8 hours ago | parent | next [-]

This, and a terminology rug pull. You wanted to upload a script and install some deps? Here’s your provisioning genuination frobnicator tutorial, at the end of which you’ll learn how to maintain the coalescing encabulation for your appliance unit schema, which is needed for automatic upload. It always feels like a thousand times more complexity (just in this part!) than your whole project.

rbanffy 5 hours ago | parent | prev [-]

> There's a right way to use it, but most ways are wrong.

This is my biggest complaint. There is no simple obvious way to set it up. There is no "sane default" config.

> It's better to run minkube and get to know k8s first.

Indeed. It should be trivial to set up a cluster from bare metal - nothing more than a `dnf install` and some other command to configure core functionality and to join machines into that cluster. Even when you go the easy way (with, say, Docker Desktop) you need to do a lot of steps just to have an ingress router.

zelphirkalt an hour ago | parent | next [-]

That was actually my "try it out for a day" experience with Nomad years ago. Just run the VMs, connect them, and they auto load balance. Meanwhile, it took a week or so to get even the most basic stuff working in Kubernetes, without even having two hosts in a cluster yet, while dealing with hundreds of pages of bad documentation.

I think the documentation has probably improved since then. I would hope so. But I will only touch Kubernetes again when I need to. So maybe at a future job.

p_l 2 hours ago | parent | prev [-]

The easy bare-metal cluster these days is k3s.

It includes a working ingress controller out of the box.
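The whole install is roughly this (single server shown; the agent join line assumes a multi-node setup and the default token path):

    # Server node: k3s ships with Traefik ingress and a local-path storage class bundled.
    curl -sfL https://get.k3s.io | sh -

    # Extra nodes: join with the server's token
    # (found at /var/lib/rancher/k3s/server/node-token on the server).
    curl -sfL https://get.k3s.io | K3S_URL=https://<server-ip>:6443 K3S_TOKEN=<token> sh -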

stiray 12 hours ago | parent | prev | next [-]

I would buy the book. Just translate all "new language" concepts into well-known concepts from networking and system administration. It would be a best seller.

If only I had a penny for each time I wasted hours trying to figure out what something in "modern IT" is, only to realize that I already knew what it was, but it was well hidden under layers of newspeak...

deivid 10 hours ago | parent | next [-]

This[0] is my take on something like that, but I'm no k8s expert -- the post documents my first contact with k8s and what/how I understood these concepts, from a sysadmin/SWE perspective.

[0]: https://blog.davidv.dev/posts/first-contact-with-k8s/

MonkeyClub 6 hours ago | parent [-]

And its related HN thread:

https://news.ycombinator.com/item?id=41093197

radicalbyte 11 hours ago | parent | prev [-]

The book I read on K8s, written by a core maintainer, is very clear.

c03 11 hours ago | parent | next [-]

Please don't mention its name, we don't want anyone else reading it...

radicalbyte 7 hours ago | parent [-]

Kubernetes in Action

(I didn't have access to my email or Amazon account, let alone my office, when I posted, so I couldn't check the name of the book.)

jpalomaki 11 hours ago | parent | prev | next [-]

Is it this one, Kubernetes: Up and Running, 3rd Edition by Brendan Burns, Joe Beda, Kelsey Hightower, Lachlan Evenson from 2022? https://www.oreilly.com/library/view/kubernetes-up-and/97810...

(edit: found the 3rd edition)

radicalbyte 7 hours ago | parent [-]

No, Kubernetes in Action, but that book was also on my radar (mainly as Kelsey Hightower's name reminds me of the Police Academy films I loved as a kid).

schnirz 11 hours ago | parent | prev [-]

Which book would that be, out of interest?

radicalbyte 7 hours ago | parent [-]

Kubernetes in Action

t-writescode 12 hours ago | parent | prev | next [-]

> Admittedly, I've been using Kubernetes since the early days and I manage an Infra team

I think this is where the big difference is. If you're leading a team and introduced good practices from the start, then the k8s and Terraform (or whatever) config files never get so complicated that a Gordian knot forms.

Perhaps k8s is nice and easy to use - many of the commands certainly are, in my experience.

Developers have, over years and decades, learned how to navigate code and hop from definition to definition, climbing the tree and learning the language they're operating in, and most of the languages follow similar-enough patterns that they can crawl around.

Configuring a k8s cluster has absolutely none of that built-up knowledge; and reading something that follows rough practices is not a good place to learn what it should look like.

paddy_m 8 hours ago | parent [-]

Thank you. At worst I can always xargs grep for a function name, or dir() in a Python debugger for other things. With YAML, Kubernetes, and other devops hotness, I frequently can’t even find the relevant scripts/YAML that get executed, nor their codebases.

This also happens with configuration-based packaging setups: Python hatch in particular, but sometimes node/webpack/rollup/vite.
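For the k8s half of that, the closest thing to grep is usually asking the API server what's actually running rather than searching the repo; roughly (names here are placeholders):

    # Everything that's actually deployed, across all namespaces.
    kubectl get deployments,daemonsets,cronjobs -A

    # The live, fully merged manifest for one workload (not whatever is in the repo).
    kubectl get deployment my-service -n my-namespace -o yaml

    # Events, mounted config, image, and probes for the same workload.
    kubectl describe deployment my-service -n my-namespace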

thiht an hour ago | parent | prev | next [-]

> It's not that complicated if you limit yourself to the core stuff

Completely agree. I use Kubernetes (basically just Deployments and CronJobs) because it makes deployments simple, reliable and standard, for a relatively low cost (assuming that I use a managed Kubernetes like GKE where I don’t need to care at all about the Kubernetes engine). Using Kubernetes as a developer is really not that hard, and it gives you no vendor lock-in in practice (every cloud provider has a Kubernetes offering) and easy replication.

It’s not the only solution, not the simplest either, but it’s a good one. And if you already know Kubernetes, it doesn’t cost much to just use it.
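As a sketch of how small that surface is, a Deployment plus a CronJob is only a screen of YAML (the names and images below are made up):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
          - name: web
            image: ghcr.io/example/web:1.0.0
            ports:
            - containerPort: 8080
    ---
    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: nightly-report
    spec:
      schedule: "0 3 * * *"    # every day at 03:00
      jobTemplate:
        spec:
          template:
            spec:
              restartPolicy: OnFailure
              containers:
              - name: report
                image: ghcr.io/example/report:1.0.0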

karmarepellent 10 hours ago | parent | prev | next [-]

Agreed. The best thing we did back when we ran k8s clusters was moving a few stateful services to dedicated VMs and keeping the clusters only for stateless services (the bulk). Running k8s for stateless services was absolute bliss.

At the time, stateful services were somewhat harder to operate on k8s because statefulness (and all that it encapsulates) was kinda full of bugs. That may well have changed over the last few years. Maybe we just did it wrong. In any case, if you focused on the core parts of k8s that were mature back then, k8s was (and is) a good thing.

figassis 12 hours ago | parent | prev | next [-]

This, all the sidecars. Use Kubernetes to run your app like you would without it, take advantage of the flexibility, avoid the extra complexity. Service discovery sidecars? Why not just use the out-of-the-box DNS features?
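The built-in version is just a Service plus cluster DNS; any pod can reach it by name with no sidecar involved (names here are placeholders):

    apiVersion: v1
    kind: Service
    metadata:
      name: orders
      namespace: shop
    spec:
      selector:
        app: orders
      ports:
      - port: 80
        targetPort: 8080
    # Clients anywhere in the cluster resolve it via CoreDNS as
    # orders.shop.svc.cluster.local (or just "orders" from within the same namespace).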

tommica 12 hours ago | parent [-]

Because new people don't know better - I've never used k8s, but have seen sidecars being promoted as a good thing, so I might have used them

namaria 6 hours ago | parent [-]

Maybe the "I've heard about" approach to tooling is the problem here?

tombert 2 hours ago | parent | prev | next [-]

If I had set up the infrastructure myself I'd probably have a different opinion on all of this stuff, but I came into this job where everything was set up for me. I don't know if it was done "incorrectly", and I do fear that claiming as much might get into territory adjacent to the "no true Scotsman" fallacy.

I mostly just think that k8s integration with GCP is a huge pain in the ass; every time I have to touch it, it's the worst part of my day.

fragmede 2 hours ago | parent [-]

What about your integration makes it a huge pita?

tombert 2 hours ago | parent [-]

It's just a lot of stuff; we have a couple hundred services, and when I've had to add shit, it ends up with me updating like two hundred files.

Infrastructure as code is great, but let's be honest, most people are not thoroughly reading through a PR with 200+ files.

There are of course tpl files to help reduce duplication, and I'm grateful for that stuff when I can get it, but for one reason or another I can't always do that.

It's also not always clear to me which YAML corresponds to which service, though I think that might be more of an issue with our individual setup.

pas 9 hours ago | parent | prev | next [-]

My problem is the brittleness.

Sure, I am playing with fire (k3s, bare metal, cilium, IPs assigned directly to Ingresses), but a few weeks ago on one cluster something suddenly stopped working in the external IP -> internal cluster IP network path. (And after a complete restart things got worse. Oops. Well, okay, time to test backups.)

coldtea 7 hours ago | parent | prev | next [-]

>Before that I did everything from golden images to pushing changes via rsync and kicking a script to deploy.

Sounds like a great KISS solution. Why did it regress into Kubernetes?

mitjam 11 hours ago | parent | prev | next [-]

Absolutely, any reasonably sophisticated scalable app platform ends up looking like a half-baked and undocumented reimagining of Kubernetes.

Admittedly: The ecosystem is huge and less is more in most cases, but the foundation is cohesive and sane.

Would love to read the k8s for greybeards book.

coldtea 7 hours ago | parent [-]

>Absolutely, any reasonably sophisticated scalable app platform ends up looking like a half-baked and undocumented reimagining of Kubernetes.

Or maybe Kubernetes looks like a committee-designed, everything-and-the-kitchen-sink, over-engineered, second-system-effect-suffering, YAGNI P.O.S. that only the kind of "enterprise" mindset that really enjoyed J2EE in 2004 and XML/SOAP over JSON/REST would love...

p_l 6 hours ago | parent [-]

You misspelled DCOS /s

cryptonym 9 hours ago | parent | prev | next [-]

K8S is a beast and the ecosystem is wild. A newcomer won't know how to keep things simple while still understanding everything that is being used.

BiteCode_dev 10 hours ago | parent | prev | next [-]

Would buy. But you should probably teach a few live courses before writing it, because of expert blindness. Otherwise you will miss the mark.

Would pay for a decent remote live course intro.

awestroke 10 hours ago | parent | prev | next [-]

I would buy that book in a heartbeat. All documentation and guides on kubernetes seem to assume you already know why things are done a certain way

DanielHB 10 hours ago | parent | prev | next [-]

> What does fucking kill me in the Kubernetes ecosystem is the amount of add-on crap that is pitched as "necessary". Sidecars... so many sidecars.

Yeah, it is the same with terraform modules. I was trying to argue at a previous job that we should stick to a single module (the cloud provider module), but people just love adding crap if it saves them 5 lines of configuration. Said crap, of course, adds tons of unnecessary resources to your cloud that no one understands.

theptrk 12 hours ago | parent | prev | next [-]

I would pay for the outline of this book.

quietbritishjim 8 hours ago | parent | prev | next [-]

I hardly know the first thing about Kubernetes or cloud, so maybe you can help explain something to me:

There's another Kubernetes post on the front page of HN at the moment, where they complain it's too complex and they had to stop using it. The comments are really laying into the article author because they used almost 50 clusters. Of course they were having trouble, the comments say, if you introduce that much complexity. They should only need one single cluster (maybe also a backup and a dev one at most). That's the whole point.

But then here you are saying your team "operates a couple thousand" clusters. If 50 is far too many, and bound to be unmanageable, how is it reasonable to have more than a thousand?

voidfunc 5 hours ago | parent | next [-]

> But then here you are saying your team "operates a couple thousand" clusters. If 50 is far too many, and bound to be unmanageable, how is it reasonable to have more than a thousand?

It's not unmanageable to have a couple thousand Kube clusters but you need to have the resources to build a staff and tool chain to support that, which most companies cannot do.

Clusters are how we shard our customer workloads (a workload being, say, a dozen services and a database; a customer may have many workloads spread across the entire fleet). We put between 100 and 150 workloads per cluster. What this gives us is a relatively small impact area if a single cluster becomes problematic, as it only affects the workloads on it.

jalk 5 hours ago | parent | prev [-]

Sounds like their primary job is to manage clusters for others, which ofc is different from trying to manage your primary service that you deployed as 50 microservices in individual clusters (didn't read the other article).

adastra22 12 hours ago | parent | prev | next [-]

I would buy that book.

AtlasBarfed 12 hours ago | parent | prev | next [-]

So you don't run any databases in those thousands of clusters?

To your point: I have not used k8s, I just started to research it when my former company was thinking about shoehorning Cassandra into k8s...

But there was dogma around not allowing command exec access via kubectl, while I basically needed it, at least in a basic form, for certain one-off diagnosis needs and nodetool stuff...

And yes, some of the floated stuff was "use sidecars", which also seemed to architect complexity for dogma's sake.

voidfunc 12 hours ago | parent | next [-]

> So you don't run any databases in those thousands of clusters?

We do, but not of the SQL variety (that I am aware of). We have persistent key-value and document store databases hosted in these clusters. SQL databases are off-loaded to managed offerings in the cloud. Admittedly, this does simplify a lot of problems for us.

tayo42 12 hours ago | parent [-]

How much data? I keep hearing k8s isn't usable because sometimes there is too much data and it can't be moved around.

darkstar_16 10 hours ago | parent | next [-]

In the managed k8s space, the data is on a PVC in the same availability zone as the node it is mounted on. If the node dies, the volume is just mounted onto a new node in the same zone. There is no data movement required.
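The piece that makes this work is the storage class's binding mode, which delays provisioning until the pod is scheduled so the volume is created in that node's zone. Roughly (the provisioner shown is GKE's persistent disk CSI driver; it differs per cloud):

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: zonal-ssd
    provisioner: pd.csi.storage.gke.io      # cloud-specific CSI driver
    volumeBindingMode: WaitForFirstConsumer # provision in whatever zone the pod lands in
    allowVolumeExpansion: true
    parameters:
      type: pd-ssd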

eek04_ 9 hours ago | parent | prev | next [-]

While I've not played with k8s, I did run stuff in Google's Borg for a very long while, and that has a similar architecture. My team was petabyte-scale and we were far from the team with the largest footprint. So it is clearly possible to handle large-scale data in this type of architecture.

pletnes 11 hours ago | parent | prev [-]

The simplest approach I’m aware of is to create the k8s cluster and databases in the same datacenter / availability zone.

pas 8 hours ago | parent | prev [-]

PostgreSQL operators are pretty nice, so it makes sense to run stateful stuff on k8s (i.e. for CI, testing, staging, dev, etc., and probably even for prod if there's a need to orchestrate shards)

> exec

kubectl exec is good, and it's possible to audit access (i.e. get kubectl exec events with arguments logged)

and I guess an admission webhook can filter the allowed commands

but IMHO it shouldn't be necessary; the bastion host that "kubectl exec" is run from should be accessible only through an SSH session recorder
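For the audit part, a minimal audit policy sketch (wiring it into the API server via --audit-policy-file and --audit-log-path is left out):

    apiVersion: audit.k8s.io/v1
    kind: Policy
    rules:
    # Record every exec/attach call; the command shows up in the event's requestURI.
    - level: Metadata
      resources:
      - group: ""
        resources: ["pods/exec", "pods/attach"]
    # Keep everything else out of this log.
    - level: None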

khafra 7 hours ago | parent | prev [-]

Container orchestration is second nature to us SREs, so it's easy to forget that the average dev probably only knows the syntax for deployments and one or two service manifests.

And pods, of course

pdimitar 2 hours ago | parent [-]

A factor for sure, but as a programmer I find that the discoverability of stuff in code is much higher than with k8s.

Give me access to a repo full of YAML files and I'm truly and completely lost and wouldn't even know where to begin.

YAML is simply not the right tool for this job. Sure you got used to it but that's exactly the point: you had to get used to it. It was not intuitive and it did not come naturally to you.