I don't think aggregating the whole platform into one number is fair. It's like rolling all of AWS into one number.

On the other hand, when you have a reasonably complex deployment it's easy to get swamped with dashboards showing CPU, memory, I/O, application metrics, signups, active users/sessions, etc.

Instead it's nice to think about how you can express the state of a complete system as a single number. For example, you might divide active user sessions by database connections, and then scale by memory capacity.

With a single number you can get used to its normal range, and have it always visible somewhere obvious. A single number won't show details, but when it changes you can go look at the specific metrics. It's a cute shorthand, and it can work well as a basic "are we normal" check.
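A minimal sketch of that idea, assuming one possible reading of "scale by memory capacity" (dividing by capacity in GiB). The names `health_index` and `is_normal`, and the normal range used below, are made up for illustration and would need tuning against your own history:

```python
def health_index(active_sessions: int, db_connections: int, memory_capacity_gib: float) -> float:
    """Hypothetical composite metric: sessions per DB connection, per GiB of memory.

    This is only one way to read "divide sessions by connections, then scale
    by memory capacity"; the exact formula is an assumption.
    """
    if db_connections <= 0:
        raise ValueError("db_connections must be positive")
    if memory_capacity_gib <= 0:
        raise ValueError("memory_capacity_gib must be positive")
    return (active_sessions / db_connections) / memory_capacity_gib


def is_normal(value: float, low: float, high: float) -> bool:
    """Basic "are we normal" check against a range learned from past observation."""
    return low <= value <= high


if __name__ == "__main__":
    index = health_index(active_sessions=1200, db_connections=40, memory_capacity_gib=64.0)
    # The 0.3 to 0.6 range is an invented example, not a recommendation.
    print(index, is_normal(index, low=0.3, high=0.6))
```

The number itself carries no meaning; the point is only that it drifts out of its usual range when something changes, which is the cue to open the detailed dashboards.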

▲

tensegrist a day ago | parent | prev | next [-]

splitting the status page like they do, to the point where it is only a bit of humorous exaggeration to say that they track broken `git push` and `git pull` separately, is a sleight of hand / accounting / SLA-fudging that we should not excuse

there is a subset of the site that pretty much everyone uses — git, issues, pull requests, actions — and if any part of that is broken then the site is broken and the status page should indicate how often this happens

▲

remus a day ago | parent | next [-]

> splitting the status page like they do, to the point where it is only a bit of humorous exaggeration to say that they track broken `git push` and `git pull` separately, is a sleight of hand / accounting / SLA-fudging that we should not excuse

This is a pretty ungenerous take. You could look at it the other way: if I don't use actions then it's useful for me to know that only actions are broken, and I can continue in my normal usage. If you bundle everything up then the status page is reporting an unhelpful false positive for me.

	▲	tensegrist 11 hours ago | parent [-]
		you can do both: report a number that shows how often your service as a whole is degraded, with a breakdown for individual components. example (not sponsored, i barely use codex and today's the first time i've ever had to look at this page; i don't know how much they're fudging the individual numbers or not reporting minor incidents): https://status.openai.com/ most people who use chatgpt don't use all of the components under the "ChatGPT" heading. for codex, i don't use the vscode extension or codex web. etc
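A rough sketch of "do both", assuming outages are available as (start, end) hour intervals per component. The data shape and the names `merged_duration` and `uptime_report` are hypothetical and not taken from any real status page API. The overall figure counts any hour in which at least one component was down, so overlapping incidents are merged before summing:

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # (start_hour, end_hour) within the reporting window


def merged_duration(intervals: List[Interval]) -> float:
    """Total time covered by the intervals, counting overlaps only once."""
    total = 0.0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Gap before this interval: close out the previous merged run.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def uptime_report(outages: Dict[str, List[Interval]], window_hours: float) -> dict:
    """Per-component uptime plus a whole-service figure (percentages)."""
    def uptime(intervals: List[Interval]) -> float:
        return 100.0 * (window_hours - merged_duration(intervals)) / window_hours

    everything = [iv for ivs in outages.values() for iv in ivs]
    return {
        "overall": uptime(everything),
        "components": {name: uptime(ivs) for name, ivs in outages.items()},
    }


if __name__ == "__main__":
    # Invented sample data, for illustration only.
    sample = {"git": [(0, 2)], "actions": [(1, 4), (10, 12)]}
    print(uptime_report(sample, window_hours=100))
```

The overall number is always less than or equal to the worst individual component, which is the gap a component-only status page hides.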

▲

bluetidepro a day ago | parent | prev | next [-]

It’s obviously a meme website; the meme is funnier when the number isn’t high. Anyone looking for actual accurate info would go to the real status page.

	▲	einsteinx2 a day ago | parent [-]
		Ironically I’ve never found official status pages to be all that accurate either since companies love to exclude all kinds of outages from counting towards uptime. Anthropic is hilariously egregious about that as a recent example I can think of, but I assume GitHub does the same since it’s so common in the industry.

▲

jasonvorhe a day ago | parent | prev | next [-]

If S3, EC2, EKS and RDS alone had uptime similar to all of GitHub's right now, we'd all know.

No one cares that much if repo wikis, commit stats or gists had these issues. What matters is the inter-dependent services that are used together, like PRs, Actions, discussions, etc.

If one were to build a single percentage for each of these components of both systems, GitHub would still lose. Maybe it would gain a few more days without outages, but it still isn't much of a comparison.

▲

llbeansandrice a day ago | parent | prev | next [-]

From a user perspective this makes sense. But if you’re MSFT or GitHub this number is pretty embarrassing.

They would love if everyone on the platform used all of the features and had massive lock-in right? So if some part of that is always broken, it’s not a confidence booster for users to adopt more of the feature set.

Sure, the more things you use the more likely it is that one has an issue, but clearly stability isn’t a goal for these types of companies anymore.

	▲	sharts 21 hours ago | parent [-]
		Why embarrassing? This is normal MSFT

▲

blinded a day ago | parent | prev | next [-]

GitHub has far fewer services and regions than AWS.

▲

8organicbits a day ago | parent | prev [-]

I think the correct middle ground is a site that lets you select the parts of the platform you rely on and ignore the others. For example, GitHub is "down" for me when I can't push, process PRs, or release packages, but I don't care about Actions or AI features.
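Something like the following sketch, assuming you already have a snapshot mapping component names to status strings. The component names, the status values, and the function `down_for_me` are invented for illustration and do not reflect any real status API:

```python
from typing import Dict, List, Set


def down_for_me(statuses: Dict[str, str], relied_on: Set[str]) -> List[str]:
    """Return the components I rely on that are not currently operational.

    A component missing from the snapshot is treated as not operational,
    on the assumption that "unknown" is safer to flag than to ignore.
    """
    return sorted(
        name for name in relied_on
        if statuses.get(name, "unknown") != "operational"
    )


if __name__ == "__main__":
    # Hypothetical snapshot of a status page.
    snapshot = {
        "git operations": "operational",
        "pull requests": "degraded",
        "packages": "operational",
        "actions": "major_outage",
    }
    mine = {"git operations", "pull requests", "packages"}
    broken = down_for_me(snapshot, mine)
    print("down for me" if broken else "up for me", broken)
```

With this selection the Actions outage is ignored, but the degraded pull requests still count as "down" for this user.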

▲

loloquwowndueo a day ago | parent [-]

You’re kind of an outlier - nobody wants AI but Actions are core for tons of workflows and deployment pipelines. Everyone bought into the “only robots can deploy” mantra (correctly IMO, it’s a huge time and friction saver) only to be bitten in the ass by the platform being so unreliable they can be stuck for days without deploys.

	▲	8organicbits a day ago | parent [-]
		That's kind of my point: everyone has a different set of GitHub features they rely on. Some people even want the AI bits.