voidfunc 13 hours ago

I'm always kind of blown away by experiences like this. Admittedly, I've been using Kubernetes since the early days and I manage an Infra team that operates a couple thousand self-managed Kubernetes clusters so... expert blindness at work. Before that I did everything from golden images to pushing changes via rsync and kicking a script to deploy.

Maybe it's because I adopted early and have grown with the technology that it all just makes sense? It's not that complicated if you limit yourself to the core stuff. Maybe I need to write a book like "Kubernetes for Greybeards" or something like that.

What does fucking kill me in the Kubernetes ecosystem is the amount of add-on crap that is pitched as "necessary". Sidecars... so many sidecars. Please stop. There's way too much vendor garbage surrounding the ecosystem, and devs rarely stop to think about whether they should deploy something when it's as easy as dropping in some YAML and letting the cluster magically run it.

jq-r 10 hours ago | parent | next [-]

Those "necessary" add-ons and sidecars are out of control, but its the people problem. I'm part of the infra team and we manage just couple of k8s clusters, but those are quite big and have very high traffic load. The k8s + terraform code is simple, with no hacks, reliable and easy to upgrade. Our devs love it, we love it too and all of this makes my job pleasant and as little stressful as possible.

But we recently hired a staff engineer to the team (now the most senior) and the guy just cannot sit still. "Oh we need a service mesh because we need visibility! I've been using it at my previous job and it's the best thing ever." Even though we have all the visibility/metrics that we need and never needed more than that. Then it's "we need a different ingress controller, X is crap, Y surely is much better!" etc.

So it's not inexperienced engineers wanting the newest hotness because they have no idea how to solve things with the tools they have; it's sometimes senior engineers trying to justify their salary and "seniority" by buying into complexity as they try to make themselves irreplaceable.

ysofunny 7 hours ago | parent | next [-]

> Then it's "we need a different ingress controller, X is crap, Y surely is much better!" etc.

I regard these as traits of a junior dev. They're thinking technology-first, not problem-first.

jppittma 8 hours ago | parent | prev | next [-]

Service mesh is complicated, but the reason you use it is to integrate services across clusters. That, and it has a bunch of useful reverse proxy features. On the other hand, it took me and two other guys two evenings of blood, sweat, and tears to understand what the fuck a virtual service actually does.

It’s not strictly necessary, but if you’ve had to put in the work elsewhere, I’d use it.

cyberpunk 8 hours ago | parent | prev | next [-]

To be fair, istio and cilium are extremely useful tools to have under your belt.

There’s always a period of “omgwhat” when new senior engineers join and they want to improve things. There’s a short window between joining and getting bogged down in a million projects where this is possible.

Embrace it, I reckon.

p_l 2 hours ago | parent [-]

Doing it well IMO requires not deploying everything as a sidecar but maybe, maybe, deploying it as a shared node service.

In fact, I'm pretty sure I've read a write-up from Alibaba? on huge performance wins from moving Istio out of the sidecar and into a shared node service.
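Istio's ambient mode is roughly that shape today: a per-node ztunnel handles traffic instead of per-pod sidecars. A rough sketch, assuming istioctl is already installed and "my-app" is just a placeholder namespace:

    # Install Istio without sidecar injection; traffic goes through a per-node ztunnel DaemonSet.
    istioctl install --set profile=ambient

    # Opt a namespace into the mesh without restarting or injecting anything into its pods.
    kubectl label namespace my-app istio.io/dataplane-mode=ambient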

cyberpunk an hour ago | parent [-]

Sure, cilium is also much faster than istio. But I guess it depends on your workload. We don't care all that much about performance vs compliance (non-hft finance transactional stuff) and I think we're doing things reasonably well. :}

p_l 12 minutes ago | parent [-]

I didn't mean replacing istio with cilium; I meant running the proxy and routing operations as a shared per-node component instead of per pod.

withinboredom 7 hours ago | parent | prev | next [-]

> So its not inexperienced engineers wanting newest hotness because they have no idea how to solve stuff with the tools they have, its sometimes senior engineers trying to justify their salary, "seniority" by buying into complexity as they try to make themselves irreplaceable.

The grass is always greener where you water it. They joined your company because the grass was greener there than anywhere else they could get an offer at. They want to keep it that way or make it even greener. Assuming that someone is doing something to become 'irreplaceable' is probably not healthy.

zelphirkalt an hour ago | parent | next [-]

They want to make it "greener" for whom? I think that is the question.

monooso 7 hours ago | parent | prev [-]

I really don't understand this comment.

alienchow 8 hours ago | parent | prev | next [-]

How do you scale mTLS ops when the CISO comes knocking?

carlmr 9 hours ago | parent | prev | next [-]

>expert blindness at work.

>It's not that complicated if you limit yourself to the core stuff.

Isn't this the core problem with a lot of technologies? There's a right way to use it, but most ways are wrong. An expert no longer looks left and right, but to anyone entering the technology with fresh eyes it's a field with an abundance of landmines to navigate around.

It's simply bad UX and documentation. It could probably be better. But now it's too late to change everything because you'd annoy all the experts.

>There's way too much vendor garbage surrounding the ecosystem

Azure has been especially bad in this regard. It's poorly documented in all respects, with too many confusing UI menus that have similar or identical names and do different things. If you use Azure Kubernetes, the wrapper makes it much harder to learn the "core essentials". It's better to run minikube and get to know k8s first. Even then a lot of the Azure stuff remains confusing.
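For anyone starting down the minikube route, the minimal loop is roughly this (the deployment and image names are just placeholders):

    # Single-node local cluster.
    minikube start

    # Run something and expose it.
    kubectl create deployment hello --image=nginx
    kubectl expose deployment hello --port=80 --type=NodePort

    # Print a URL reachable from the host.
    minikube service hello --url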

wruza 8 hours ago | parent | next [-]

This, and a terminology rug pull. You wanted to upload a script and install some deps? Here’s your provisioning genuination frobnicator tutorial, at the end of which you’ll learn how to maintain the coalescing encabulation for your appliance unit schema, which is needed for automatic upload. It always feels like a thousand times more complexity (just in this part!) than your whole project.

rbanffy 5 hours ago | parent | prev [-]

> There's a right way to use it, but most ways are wrong.

This is my biggest complaint. There is no simple obvious way to set it up. There is no "sane default" config.

> It's better to run minkube and get to know k8s first.

Indeed. It should be trivial to set up a cluster from bare metal - nothing more than a `dnf install` and some other command to configure core functionality and to join machines into that cluster. Even when you go the easy way (with, say, Docker Desktop) you need to do a lot of steps just to have an ingress router.

zelphirkalt an hour ago | parent | next [-]

That was actually my "try it out for a day" experience with Nomad years ago. Just run the VMs, connect them, and they auto load balance. Meanwhile, it took a week or so to get even the most basic stuff working in Kubernetes, without even having two hosts in a cluster yet, while dealing with hundreds of pages of bad documentation.

I think the documentation has probably improved since then. I would hope so. But I will only touch Kubernetes again when I need to. So maybe at a future job.

p_l 2 hours ago | parent | prev [-]

The easy bare-metal cluster these days is k3s.

It includes a working ingress controller out of the box.
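The whole install is roughly this (single server shown; the agent join line assumes a multi-node setup and the default token path):

    # Server node: k3s ships with Traefik ingress and a local-path storage class bundled.
    curl -sfL https://get.k3s.io | sh -

    # Extra nodes: join with the server's token
    # (found at /var/lib/rancher/k3s/server/node-token on the server).
    curl -sfL https://get.k3s.io | K3S_URL=https://<server-ip>:6443 K3S_TOKEN=<token> sh -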

stiray 12 hours ago | parent | prev | next [-]

I would buy the book. Just translate all "new language" concepts into well-known concepts from networking and system administration. It would be a best seller.

If only I had a penny for each time I wasted hours trying to figure out what something in "modern IT" is, only to realize that I already knew what it was, but it was well hidden under layers of newspeak...

deivid 10 hours ago | parent | next [-]

This[0] is my take on something like that, but I'm no k8s expert -- the post documents my first contact with k8s and what/how I understood these concepts, from a sysadmin/SWE perspective.

[0]: https://blog.davidv.dev/posts/first-contact-with-k8s/

MonkeyClub 6 hours ago | parent [-]

And its related HN thread:

https://news.ycombinator.com/item?id=41093197

radicalbyte 11 hours ago | parent | prev [-]

The book I read on K8s, written by a core maintainer, is very clear.

c03 11 hours ago | parent | next [-]

Please don't mention its name, we don't want anyone else reading it...

radicalbyte 7 hours ago | parent [-]

Kubernetes in Action

(I didn't have access to my email or Amazon account, let alone my office, when I posted, so I couldn't check the name of the book.)

jpalomaki 11 hours ago | parent | prev | next [-]

Is it this one, Kubernetes: Up and Running, 3rd Edition by Brendan Burns, Joe Beda, Kelsey Hightower, Lachlan Evenson from 2022? https://www.oreilly.com/library/view/kubernetes-up-and/97810...

(edit: found the 3rd edition)

radicalbyte 7 hours ago | parent [-]

No, Kubernetes in Action, but that book was also on my radar (mainly as Kelsey Hightower's name reminds me of the Police Academy films I loved as a kid).

schnirz 11 hours ago | parent | prev [-]

Which book would that be, out of interest?

radicalbyte 7 hours ago | parent [-]

Kubernetes in Action

t-writescode 12 hours ago | parent | prev | next [-]

> Admittedly, I've been using Kubernetes since the early days and I manage an Infra team

I think this is where the big difference is. If you're leading a team and introduced good practices from the start, then the k8s and Terraform (or whatever) config files never get so complicated that a Gordian knot forms.

Perhaps k8s is nice and easy to use - many of the commands certainly are, in my experience.

Developers have, over years and decades, learned how to navigate code and hop from definition to definition, climbing the tree and learning the language they're operating in, and most of the languages follow similar-enough patterns that they can crawl around.

Configuring a k8s cluster has absolutely none of that built-up knowledge; and reading something that follows rough practices is not a good place to learn what it should look like.

paddy_m 8 hours ago | parent [-]

Thank you. At worst I can always xargs grep for a function name, or dir() in a Python debugger for other things. With YAML, Kubernetes, and other devops hotness, I frequently can’t even find the relevant scripts/YAML that get executed, nor their codebases.

This also happens with configuration-based packaging setups: Python hatch in particular, but sometimes node/webpack/rollup/vite.
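For the k8s half of that, the closest thing to grep is usually asking the API server what's actually running rather than searching the repo; roughly (names here are placeholders):

    # Everything that's actually deployed, across all namespaces.
    kubectl get deployments,daemonsets,cronjobs -A

    # The live, fully merged manifest for one workload (not whatever is in the repo).
    kubectl get deployment my-service -n my-namespace -o yaml

    # Events, mounted config, image, and probes for the same workload.
    kubectl describe deployment my-service -n my-namespace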

thiht an hour ago | parent | prev | next [-]

> It's not that complicated if you limit yourself to the core stuff

Completely agree. I use Kubernetes (basically just Deployments and CronJobs) because it makes deployments simple, reliable and standard, for a relatively low cost (assuming that I use a managed Kubernetes like GKE where I don’t need to care at all about the Kubernetes engine). Using Kubernetes as a developer is really not that hard, and it gives you no vendor lock-in in practice (every cloud provider has a Kubernetes offering) and easy replication.

It’s not the only solution, not the simplest either, but it’s a good one. And if you already know Kubernetes, it doesn’t cost much to just use it.
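As a sketch of how small that surface is, a Deployment plus a CronJob is only a screen of YAML (the names and images below are made up):

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: web
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: web
      template:
        metadata:
          labels:
            app: web
        spec:
          containers:
          - name: web
            image: ghcr.io/example/web:1.0.0
            ports:
            - containerPort: 8080
    ---
    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: nightly-report
    spec:
      schedule: "0 3 * * *"    # every day at 03:00
      jobTemplate:
        spec:
          template:
            spec:
              restartPolicy: OnFailure
              containers:
              - name: report
                image: ghcr.io/example/report:1.0.0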

karmarepellent 10 hours ago | parent | prev | next [-]

Agreed. The best thing we did back when we ran k8s clusters was moving a few stateful services to dedicated VMs and keeping the clusters only for stateless services (the bulk). Running k8s for stateless services was absolute bliss.

At the time, stateful services were somewhat harder to operate on k8s because statefulness (and all that it encapsulates) was kinda full of bugs. That may well have changed over the last few years. Maybe we just did it wrong. In any case, if you focused on the core parts of k8s that were mature back then, k8s was (and is) a good thing.

figassis 12 hours ago | parent | prev | next [-]

This, all the sidecars. Use Kubernetes to run your app like you would without it, take advantage of the flexibility, avoid the extra complexity. Service discovery sidecars? Why not just use the out-of-the-box DNS features?
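The built-in version is just a Service plus cluster DNS; any pod can reach it by name with no sidecar involved (names here are placeholders):

    apiVersion: v1
    kind: Service
    metadata:
      name: orders
      namespace: shop
    spec:
      selector:
        app: orders
      ports:
      - port: 80
        targetPort: 8080
    # Clients anywhere in the cluster resolve it via CoreDNS as
    # orders.shop.svc.cluster.local (or just "orders" from within the same namespace).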

tommica 12 hours ago | parent [-]

Because new people don't know better - I've never used k8s, but have seen sidecars being promoted as a good thing, so I might have used them

namaria 6 hours ago | parent [-]

Maybe the "I've heard about" approach to tooling is the problem here?

tombert 2 hours ago | parent | prev | next [-]

If I had set up the infrastructure myself I'd probably have a different opinion on all of this stuff, but I came into this job where everything was set up for me. I don't know if it was done "incorrectly", and I do fear that claiming as much might get into territory adjacent to the "no true Scotsman" fallacy.

I mostly just think that k8s integration with GCP is a huge pain in the ass; every time I have to touch it, it's the worst part of my day.

fragmede 2 hours ago | parent [-]

What about your integration makes it a huge pita?

tombert 2 hours ago | parent [-]

It's just a lot of stuff; we have a couple hundred services, and when I've had to add shit, it ends up with me updating like two hundred files.

Infrastructure as code is great, but let's be honest, most people are not thoroughly reading through a PR with 200+ files.

There are of course tpl files to help reduce duplication, and I'm grateful for that stuff when I can get it, but for one reason or another I can't always do that.

It's also not always clear to me which YAML corresponds to which service, though I think that might be more of an issue with our individual setup.

pas 9 hours ago | parent | prev | next [-]

My problem is the brittleness.

Sure, I am playing with fire (k3s, bare metal, cilium, IPs assigned directly to Ingresses), but a few weeks ago on one cluster something suddenly stopped working in the external IP -> internal cluster IP network path. (And after a complete restart things got worse. Oops. Well, okay, time to test backups.)

coldtea 7 hours ago | parent | prev | next [-]

>Before that I did everything from golden images to pushing changes via rsync and kicking a script to deploy.

Sounds like a great KISS solution. Why did it regress into Kubernetes?

mitjam 11 hours ago | parent | prev | next [-]

Absolutely, any reasonably sophisticated scalable app platform ends up looking like a half-baked and undocumented reimagining of Kubernetes.

Admittedly: The ecosystem is huge and less is more in most cases, but the foundation is cohesive and sane.

Would love to read the k8s for greybeards book.

coldtea 7 hours ago | parent [-]

>Absolutely, any reasonably sophisticated scalable app platform ends up looking like a half-baked and undocumented reimagining of Kubernetes.

Or maybe Kubernetes looks like a committee-designed, everything-and-the-kitchen-sink, over-engineered, second-system-effect-suffering, YAGNI P.O.S. that only the kind of "enterprise" mindset that really enjoyed J2EE in 2004 and XML/SOAP over JSON/REST would love...

p_l 6 hours ago | parent [-]

You misspelled DCOS /s

cryptonym 9 hours ago | parent | prev | next [-]

K8S is a beast and the ecosystem is wild. A newcomer won't know how to keep things simple while still understanding everything that is being used.

BiteCode_dev 10 hours ago | parent | prev | next [-]

Would buy. But you should probably teach a few live courses before writing it, because of expert blindness. Otherwise you will miss the mark.

Would pay for a decent remote live course intro.

awestroke 10 hours ago | parent | prev | next [-]

I would buy that book in a heartbeat. All documentation and guides on kubernetes seem to assume you already know why things are done a certain way

DanielHB 10 hours ago | parent | prev | next [-]

> What does fucking kill me in the Kubernetes ecosystem is the amount of add-on crap that is pitched as "necessary". Sidecars... so many sidecars.

Yeah, it is the same with terraform modules. I was trying to argue at a previous job that we should stick to a single module (the cloud provider module), but people just love adding crap if it saves them 5 lines of configuration. Said crap, of course, adds tons of unnecessary resources to your cloud that no one understands.

theptrk 12 hours ago | parent | prev | next [-]

I would pay for the outline of this book.

quietbritishjim 8 hours ago | parent | prev | next [-]

I hardly know the first thing about Kubernetes or cloud, so maybe you can help explain something to me:

There's another Kubernetes post on the front page of HN at the moment, where they complain it's too complex and they had to stop using it. The comments are really laying into the article author because they used almost 50 clusters. Of course they were having trouble, the comments say, if you introduce that much complexity. They should only need one single cluster (maybe also a backup and a dev one at most). That's the whole point.

But then here you are saying your team "operates a couple thousand" clusters. If 50 is far too many, and bound to be unmanageable, how is it reasonable to have more than a thousand?

voidfunc 5 hours ago | parent | next [-]

> But then here you are saying your team "operates a couple thousand" clusters. If 50 is far too many, and bound to be unmanageable, how is it reasonable to have more than a thousand?

It's not unmanageable to have a couple thousand Kube clusters but you need to have the resources to build a staff and tool chain to support that, which most companies cannot do.

Clusters are how we shard our customer workloads (a workload being, say, a dozen services and a database; a customer may have many workloads spread across the entire fleet). We put between 100 and 150 workloads per cluster. What this gives us is a relatively small impact area if a single cluster becomes problematic, as it only affects the workloads on it.

jalk 5 hours ago | parent | prev [-]

Sounds like their primary job is to manage clusters for others, which ofc is different from trying to manage your primary service that you deployed as 50 microservices in individual clusters (didn't read the other article).

adastra22 12 hours ago | parent | prev | next [-]

I would buy that book.

AtlasBarfed 12 hours ago | parent | prev | next [-]

So you don't run any databases in those thousands of clusters?

To your point: I have not used k8s, I just started to research it when my former company was thinking about shoehorning Cassandra into k8s...

But there was dogma around not allowing command exec access via kubectl, while I basically needed it, at least in a basic form, for certain one-off diagnosis needs and nodetool stuff...

And yes, some of the floated stuff was "use sidecars", which also seemed to architect complexity for dogma's sake.

voidfunc 12 hours ago | parent | next [-]

> So you don't run any databases in those thousands of clusters?

We do, but not of the SQL variety (that I am aware of). We have persistent key-value and document store databases hosted in these clusters. SQL databases are off-loaded to managed offerings in the cloud. Admittedly, this does simplify a lot of problems for us.

tayo42 12 hours ago | parent [-]

How much data? I keep hearing k8s isn't usable because sometimes there is too much data and it can't be moved around.

darkstar_16 10 hours ago | parent | next [-]

In the managed k8s space, the data is on a PVC in the same availability zone as the node it is mounted on. If the node dies, the volume is just mounted onto a new node in the same zone. There is no data movement required.
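The piece that makes this work is the storage class's binding mode, which delays provisioning until the pod is scheduled so the volume is created in that node's zone. Roughly (the provisioner shown is GKE's persistent disk CSI driver; it differs per cloud):

    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: zonal-ssd
    provisioner: pd.csi.storage.gke.io      # cloud-specific CSI driver
    volumeBindingMode: WaitForFirstConsumer # provision in whatever zone the pod lands in
    allowVolumeExpansion: true
    parameters:
      type: pd-ssd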

eek04_ 9 hours ago | parent | prev | next [-]

While I've not played with k8s, I did run stuff in Google's Borg for a very long while, and that has a similar architecture. My team was petabyte-scale and we were far from the team with the largest footprint. So it is clearly possible to handle large-scale data in this type of architecture.

pletnes 11 hours ago | parent | prev [-]

The simplest approach I’m aware of is to create the k8s cluster and databases in the same datacenter / availability zone.

pas 8 hours ago | parent | prev [-]

PostgreSQL operators are pretty nice, so it makes sense to run stateful stuff on k8s (i.e. for CI, testing, staging, dev, etc., and probably even for prod if there's a need to orchestrate shards)

> exec

kubectl exec is good, and it's possible to audit access (i.e. get kubectl exec events with arguments logged)

and I guess an admission webhook can filter the allowed commands

but IMHO it shouldn't be necessary; the bastion host that "kubectl exec" is run from should be accessible only through an SSH session recorder
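For the audit part, a minimal audit policy sketch (wiring it into the API server via --audit-policy-file and --audit-log-path is left out):

    apiVersion: audit.k8s.io/v1
    kind: Policy
    rules:
    # Record every exec/attach call; the command shows up in the event's requestURI.
    - level: Metadata
      resources:
      - group: ""
        resources: ["pods/exec", "pods/attach"]
    # Keep everything else out of this log.
    - level: None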

khafra 7 hours ago | parent | prev [-]

Container orchestration is second nature to us SREs, so it's easy to forget that the average dev probably only knows the syntax for deployments and one or two service manifests.

And pods, of course

pdimitar 2 hours ago | parent [-]

A factor for sure, but as a programmer I find that the discoverability of stuff in code is much higher than with k8s.

Give me access to a repo full of YAML files and I'm truly and completely lost and wouldn't even know where to begin.

YAML is simply not the right tool for this job. Sure you got used to it but that's exactly the point: you had to get used to it. It was not intuitive and it did not come naturally to you.