mountainriver 10 days ago

Love headscale, we just took it to production and it’s been great

linsomniac 9 days ago | parent | next [-]

I've been running headscale for 2.5 years and it's been pretty good. We use our gmail domain for logging in, which has the big benefit that users can self-serve their devices, unlike with OpenVPN in the past, where ops had to hand off the certs and configs. Really the only downside has been when users accidentally connect to the tailscale login server instead of our own and then can't figure out why they can't reach any services. We use user groups to set up what services users can access.

We are still running the old headscale, because we have some integrations that will need to be ported to the new control plane. According to "headscale node list | wc" we have ~250 nodes, most of them are servers.

One thing I really don't love about tailscale is some of the magic it does with the routing tables and adding firewall rules, but it has mostly not been an issue. Tailscale has worked really quite well.

syntaxing 10 days ago | parent | prev [-]

As in you rolled out an internal service for the whole company?!

cassianoleal 9 days ago | parent | next [-]

As opposed to what? This seems pretty normal.

We considered it as well but there was a feature missing that meant we couldn’t use it for one of our main requirements. Had that not been the case, we’d have rolled it out.

mrklol 9 days ago | parent [-]

Mind sharing which feature?

cassianoleal 9 days ago | parent [-]

Honestly I'm hazy on the details but we're running a fairly complex environment in GCP with PSC everywhere, connections to on-prem and other external environments, and something wouldn't quite work due to all that.

Sorry I can't provide any more details but I really don't remember the specifics. We were in touch with Tailscale engineers and they offered some workarounds that we had already worked out but that wouldn't help us achieve what we were after.

sshine 10 days ago | parent | prev | next [-]

I’d love to see a write-up on that.

Especially in the unlikely event that you used Nix for the deployment.

benley 9 days ago | parent [-]

I've done exactly that: headscale in production at work, a few hundred client devices, infrastructure mostly powered by nix. What would you want to hear about it?

squiggleblaz 9 days ago | parent | next [-]

* Does it work well?
* Do you recommend it?
* Do your users care?
* Is it difficult? Do you have to maintain it or is it basically set it and forget it?
* What was memorable about setting it up?
* Why did you go for Headscale vs Tailscale or Netbird or some other solution?

benley 2 days ago | parent [-]

I posted a reply to another subthread with some of this: https://news.ycombinator.com/item?id=43647368

> * Does it work well?

Very well! There are some limitations (see link above), but what's implemented is reliable.

> * Do you recommend it?

Yes, provided your requirements fit headscale's capabilities. If you need things like device trust attestation (e.g. Kandji MDM or Crowdstrike Falcon integration), SCIM provisioning, or various other enterprise features you may find it inadequate. If you can afford to pay for Tailscale, you should just use Tailscale because it's really good.

> * Do your users care?

They like it way better than our previous OpenVPN setup, that's for sure. I don't think they care about Headscale vs commercial Tailscale - the backend implementation is largely invisible to them.

> * Is it difficult? Do you have to maintain it or is it basically set it and forget it?

Not hard at all to set up, and it requires little maintenance attention. I have barely had to touch the control plane (other than version upgrades) since setting it up a year ago.

> * What was memorable about setting it up?

We had to do some custom coding to have automatic user offboarding when employees leave the company, and to emulate app connectors / dynamic routing (this is now OSS! https://github.com/singlestore-labs/tailscale-manager).

And I've been contributing to the headscale codebase to smooth out some quirks that affected our SSO integration. The headscale authors have been pretty flexible in welcoming outside contributors.
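The offboarding step mentioned above is essentially a set difference between the accounts the VPN knows about and the people who still work at the company. Here is a minimal Python sketch of that idea, not the actual code used: the function name and the example addresses are made up for illustration, and fetching the two lists (from headscale and from an HR or SSO system) is deliberately left out.

```python
from typing import Iterable


def users_to_offboard(vpn_users: Iterable[str],
                      active_employees: Iterable[str]) -> list[str]:
    """Return VPN users that have no matching active employee.

    Both inputs are plain identifiers (e-mail addresses here). They are
    normalised to lower case so that "Carol@..." and "carol@..." match.
    """
    active = {e.strip().lower() for e in active_employees}
    known = {u.strip().lower() for u in vpn_users}
    # Sorted so repeated runs produce a stable, diff-friendly list.
    return sorted(known - active)


if __name__ == "__main__":
    # Hypothetical data; in practice these would come from the VPN
    # control plane and from the company directory.
    vpn = ["alice@example.com", "bob@example.com", "Carol@example.com"]
    directory = ["alice@example.com", "carol@example.com"]
    for user in users_to_offboard(vpn, directory):
        print("would offboard:", user)
```

Keeping the comparison a pure function makes it easy to run in a dry-run mode first, before wiring it to anything that actually expires nodes or deletes users.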

> * Why did you go for Headscale vs Tailscale or Netbird or some other solution?

vs Tailscale: It was way easier to build this myself than to get funding to use the commercial solution. I'm not good at navigating corporate politics, but I am pretty good at building infrastructure and writing code.

vs Netbird: Mostly because I already liked Tailscale from using it at home, I like its implementation, and I like the way Tailscale (the company) have behaved. The handful of folks I know who work there are people I deeply respect.

sshine 9 days ago | parent | prev [-]

> headscale in production at work

  - How much effort do you put into key management compared to plain WireGuard?
  - How automated is the onboarding process; do you generate and hand over keys?
  - How do you cope without the commercial Tailscale dashboard?
  - Do you run some kind of dashboard or metrics system?
  - How long did it take to set up?
  - Were there any gotchas?

avtar 9 days ago | parent | next [-]

> How do you cope without the commercial Tailscale dashboard?

There are a couple open source dashboard options but right now only this one comes to mind: https://github.com/tale/headplane

benley 2 days ago | parent [-]

There are a bunch of them: https://headscale.net/stable/ref/integration/web-ui/?h=web

The one I've deployed is https://github.com/gurucomputing/headscale-ui, which is basic but does what I need.

benley 2 days ago | parent | prev [-]

> - How much effort do you put into key management compared to plain WireGuard?

Less effort than plain wireguard; the only key management I do is for non-human clients.

> - How automated is the onboarding process; do you generate and hand over keys?

Fully automated. Auth is done via OIDC to my company's SSO provider, so users can enroll their own machines without IT involvement.

> - How do you cope without the commercial Tailscale dashboard?

I don't really miss it. The headscale CLI tool is pretty good, and I use one of the headscale web UI projects (there are several: https://headscale.net/stable/ref/integration/web-ui/?h=web) for quick access to a few features (https://github.com/gurucomputing/headscale-ui).

> - Do you run some kind of dashboard or metrics system?

Yes, I scrape headscale's Prometheus metrics endpoint and have put together a simple Grafana dashboard. The metrics it emits are somewhat limited, but enough to keep an eye on its health.
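For anyone who wants a quick health check without standing up Prometheus and Grafana first, the text format served by a Prometheus metrics endpoint is simple enough to parse by hand. The sketch below is an illustration only: the metric names in the sample are invented rather than taken from headscale, and the parser covers just the common `name{labels} value [timestamp]` lines.

```python
import re

# One sample line: metric name, optional {labels}, value, optional timestamp.
_SAMPLE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)(?:\s+-?\d+)?$'
)


def parse_metrics(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition output into {series: value}.

    Comment lines (# HELP / # TYPE), blank lines, and lines that do not
    look like a sample are skipped.
    """
    samples: dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue
        name, labels, value = match.groups()
        samples[name + (labels or "")] = float(value)
    return samples


if __name__ == "__main__":
    # Hypothetical scrape output; real metric names will differ.
    scrape = """\
# HELP example_up Whether the service is up.
# TYPE example_up gauge
example_up 1
example_requests_total{code="200"} 42
"""
    for series, value in parse_metrics(scrape).items():
        print(series, value)
```

In practice the input would be the body of an HTTP GET against whatever metrics address the server is configured to listen on.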

> - How long did it take to set up?

I had a prototype up and running on Kubernetes with OIDC integration and a web UI in about 1 day of hacking. Going into full production took a few months, but the majority of that time was about planning the migration of all the existing users from OpenVPN.

Come to think of it, maybe I should share my terraform modules for deploying it.

> - Were there any gotchas?

A few, yeah:

- Setting up mobile clients is a bit fiddly, because they hide the "connect to a non-default control plane URL" under a debug menu. The mac and windows apps are similar - it's too easy for users to accidentally try to connect to tailscale.com instead of your headscale instance. If you have the ability to deploy MDM profiles (mac) or windows registry tweaks this is easy to fix, and the headscale server will even generate the configs for you.

- The headscale control plane doesn't support any kind of HA or replication. This doesn't disqualify it since tailscale can handle brief control plane outages without breaking the network, but it's likely to be a concern for serious enterprise users. It's possible to use an external Postgres database, so you can at least replicate data that way, but only one headscale server replica can be active at a time because they don't share runtime state.

- The tailscale API is not fully implemented, so you can't use things like the tailscale Kubernetes operator.

- Some features are missing: tailscale funnel, tailscale serve, app connectors, `autogroup:self` ACLs, SCIM provisioning, SSO group membership sync, and I forget what else. These may or may not be important to you.

For app connectors, I wrote an app to emulate the core functionality: https://github.com/singlestore-labs/tailscale-manager (it's in Haskell, but deployers don't need to care about that)

It's possible to implement group sync with some custom scripting - a python app to scrape your LDAP (or whatever) and generate tailscale ACLs isn't hard to write. But you do have to write it.
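A rough sketch of the "generate ACLs from a directory" half of such a script, in Python. Everything here is an assumption for illustration: the function and argument names are invented, the LDAP query is replaced by a plain dict, and the output only follows the general shape of a Tailscale-style policy file (`groups` plus `acls` entries with `action`, `src`, and `dst`). Check what your headscale version actually accepts before loading anything generated this way.

```python
import json


def build_policy(group_members: dict[str, list[str]],
                 group_access: dict[str, list[str]]) -> dict:
    """Build a policy dict from directory groups and per-group destinations.

    group_members: directory group name -> list of user identifiers
    group_access:  directory group name -> list of "host:port" destinations
    Groups that have access rules but no members in the directory are skipped.
    """
    groups = {
        f"group:{name}": sorted(members)
        for name, members in sorted(group_members.items())
    }
    acls = [
        {"action": "accept", "src": [f"group:{name}"], "dst": list(dsts)}
        for name, dsts in sorted(group_access.items())
        if name in group_members
    ]
    return {"groups": groups, "acls": acls}


if __name__ == "__main__":
    # Hypothetical directory contents and access rules.
    members = {"eng": ["bob@example.com", "alice@example.com"]}
    access = {"eng": ["10.0.0.0/24:*"], "finance": ["10.1.0.5:443"]}
    print(json.dumps(build_policy(members, access), indent=2))
```

Sorting the groups and members keeps the generated file stable between runs, so a cron job only needs to push a new policy when the output actually changes.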

`autogroup:self` might be a big deal - you would need this if you want to stop users from seeing or connecting directly to each other's devices. I think there is an implementation of this coming in the next release of headscale.

Summary: headscale is great if you have relatively simple needs and can't afford to pay for Tailscale. You will probably outgrow it if you're running a serious business and need to comply with fancy audit requirements.

mountainriver 3 days ago | parent | prev [-]

All our infra