That works for authn in the happy path: short-lived cert, grab it, connect, done.

Except for everything around that:

* user lifecycle (create/remove/rename accounts)

* authz (who gets sudo, what groups, per-host differences)

* cleanup (what happens when someone leaves)

* visibility (what state is this box actually in right now?)

SSH certs don’t really touch any of that. They answer "can this key log in right now?", not "what should exist on this machine?"

So in practice, something else ends up managing users, groups, sudoers, home dirs, etc. Now there are two systems that both have to be correct.

On the availability point: "reasonably available" is doing a lot of work ;)

Even with 1-hour certs:

* new sessions depend on the signer

* fleet-wide issues hit everything at once

* incident response gets awkward if the signer is part of the blast radius

The failure mode shifts from "a few boxes don't work" to "nobody can get in anywhere".
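A toy model of that dependency, as a sketch only: the function name and the one-hour TTL are illustrative assumptions, and real deployments have more moving parts (renewal windows, break-glass keys, and so on). It shows that an unexpired cert still works while the signer is down, but once it expires every new session needs the signer.

```python
from datetime import datetime, timedelta

def can_open_session(now, cert_expires_at, signer_reachable):
    """Hypothetical model: can a new SSH session authenticate at `now`?"""
    if now < cert_expires_at:
        # An unexpired cert is verified by sshd alone; the signer is not involved.
        return True
    # The cert has expired, so a fresh one must be issued by the signer.
    return signer_reachable

issued = datetime(2024, 1, 1, 12, 0)
expires = issued + timedelta(hours=1)  # assumed 1-hour TTL

during_ttl = can_open_session(issued + timedelta(minutes=30), expires, signer_reachable=False)
after_ttl = can_open_session(issued + timedelta(hours=2), expires, signer_reachable=False)
```

With the signer down, `during_ttl` is true and `after_ttl` is false, and the same flip happens on every host at roughly the same time.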

The pull model just leans the other way:

* nodes converge to desired state

* access continues even if the control plane hiccups

* authn and authz live together on the box
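"Converge to desired state" can be sketched roughly like this. It is a minimal illustration, not any particular tool's implementation: `plan_convergence` and the action names are made up, and a real agent would also handle keys, sudoers, home dirs, etc.

```python
def plan_convergence(desired, actual):
    """Return the actions needed to move local accounts to the desired state.

    Both arguments map username -> set of group names (hypothetical schema).
    """
    actions = []
    # Accounts that should exist but don't yet.
    for user in sorted(desired.keys() - actual.keys()):
        actions.append(("create", user, sorted(desired[user])))
    # Accounts that exist but are no longer wanted (e.g. someone left).
    for user in sorted(actual.keys() - desired.keys()):
        actions.append(("remove", user, []))
    # Accounts that exist on both sides but whose groups drifted.
    for user in sorted(desired.keys() & actual.keys()):
        if desired[user] != actual[user]:
            actions.append(("set_groups", user, sorted(desired[user])))
    return actions

desired = {"alice": {"sudo"}, "bob": set()}
actual = {"bob": {"sudo"}, "carol": set()}
plan = plan_convergence(desired, actual)
```

Here the plan creates `alice`, removes `carol`, and drops `bob` out of `sudo`. Running it again once the box matches gives an empty plan, which is why the node keeps working off its last known state when the control plane is unreachable.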

Both models can work - it’s more about which failure mode is tolerable to you.

Well, yes, pick your poison.

But for just getting access to role accounts, I find it a lot nicer than distributing public keys around.

And for everything else, a periodic Ansible :-)

gnufx 10 minutes ago:

Public keys (for OpenSSH) can be in DNS (VerifyHostKeyDNS) or in, say, LDAP via KnownHostsCommand and AuthorizedKeysCommand.
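
For the AuthorizedKeysCommand side, a rough sketch of a helper: sshd runs the configured command (e.g. with the `%u` token for the username, under `AuthorizedKeysCommandUser`) and reads authorized_keys-format lines from its stdout. Everything below is a stand-in: `DIRECTORY` replaces a real LDAP query and the key string is a placeholder, not a real key.

```python
import sys

# Hypothetical stand-in for an LDAP lookup; the key material is a placeholder.
DIRECTORY = {
    "deploy": ["ssh-ed25519 AAAA_PLACEHOLDER_KEY deploy@example"],
}

def authorized_keys_for(user, directory=DIRECTORY):
    """Return the user's public keys, one per line, in authorized_keys format."""
    return "\n".join(directory.get(user, []))

if __name__ == "__main__" and len(sys.argv) == 2:
    # sshd would pass the username as the first argument (via %u).
    output = authorized_keys_for(sys.argv[1])
    if output:
        print(output)
```

An unknown user yields no output, which sshd treats as "no keys from this source".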