|
| ▲ | ablob 7 hours ago | parent | next [-] |
| If you're using all services, then any partial outage is essentially a full outage.
Of course, you can massage the numbers to make it look nicer in the way you described but the conservative approach is better for the customers.
If you insist, one could create this metric for selected services only to "better reflect users". That being said, even when looking at the split uptimes, you'd have to do a very skewed weighting to achieve a number with more than one 9. |
| |
| ▲ | verdverm 7 hours ago | parent [-] | | > That being said, even when looking at the split uptimes, you'd have to do a very skewed weighting to achieve a number with more than one 9. It's definitely bad no matter how you slice the pie. If GH pages is not serving content, my work is not blocked. (I don't use GH pages for anything personally) |
|
|
| ▲ | marcosdumay 7 hours ago | parent | prev | next [-] |
| That's how you count uptime. Your system is not up if it keeps failing when the user does something. The problem here is the specification of what the system is. It's a bit unfair to call GH a single service, but it's how Microsoft sells it. |
| |
| ▲ | verdverm 7 hours ago | parent [-] | | > That's how you count uptime. It's not how I and many others calculate uptime. There is no uniformity, especially when you look at contracts. |
|
|
| ▲ | bandrami 2 hours ago | parent | prev | next [-] |
| Thinking back to when I was hosting, I think telling a customer "your web server was running fine; it's just that the database was down" would not have been received well. |
|
| ▲ | mort96 7 hours ago | parent | prev | next [-] |
| I mean I think it's useful. It answers the question, "what percentage of the time can I rely on every part of GitHub to work correctly?". The answer seems to be roughly 90% of the time. |
| |
| ▲ | verdverm 7 hours ago | parent | next [-] | | I don't use half of the services, so the answer is not straightforward: https://mrshu.github.io/github-statuses/ | |
| ▲ | naniwaduni 7 hours ago | parent | prev [-] | | Nobody cares about every part of GitHub working correctly. I mean, ok, their SREs are supposed to, but tabling the question of whether that's true: if tomorrow they announced a distributed no-op service with 100% downtime, you should not have the intuition that the overall availability of the platform is now worse. |
|
|
| ▲ | formerly_proven 7 hours ago | parent | prev [-] |
| In a nutshell, why would the consumer care (for the SLO) about how the vendor sliced the solution into microservices? |
| |
| ▲ | verdverm 7 hours ago | parent [-] | | It will depend on the contract. When I was at IBM, they didn't meet their SLOs for Watson and customers got a refund for that portion of their spend |
|
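The two ways of counting uptime argued over in this thread can be put side by side in a small sketch. The "every part works" number is approximated here as the product of per-service uptimes, which assumes outages are independent (real incidents often overlap, so a status page would need actual incident intervals). The per-user number is a weighted average over only the services that user depends on. The function names, service names, and all figures are hypothetical and exist only for illustration; they are not GitHub's real numbers.

```python
from typing import Mapping


def all_up_availability(uptimes: Mapping[str, float]) -> float:
    """Fraction of time every listed service is up at once.

    Assumption: outages are independent, so the per-service
    uptimes can simply be multiplied together.
    """
    result = 1.0
    for uptime in uptimes.values():
        result *= uptime
    return result


def weighted_availability(
    uptimes: Mapping[str, float], weights: Mapping[str, float]
) -> float:
    """Weighted average uptime; services missing from `weights` count as 0."""
    total_weight = sum(weights.get(name, 0.0) for name in uptimes)
    if total_weight <= 0:
        raise ValueError("at least one service needs a positive weight")
    weighted_sum = sum(
        uptimes[name] * weights.get(name, 0.0) for name in uptimes
    )
    return weighted_sum / total_weight


# Hypothetical uptimes, invented for this example.
uptimes = {"git": 0.99, "actions": 0.97, "pages": 0.995}

# A user who relies on git and actions but never touches pages.
my_weights = {"git": 1.0, "actions": 1.0}

print(all_up_availability(uptimes))
print(weighted_availability(uptimes, my_weights))

# Adding a service that is always down drags the "all up" number to zero,
# while a user who gives it zero weight sees no change.
with_noop = {**uptimes, "noop": 0.0}
print(all_up_availability(with_noop))
print(weighted_availability(with_noop, my_weights))
```

The last two lines illustrate naniwaduni's point: the composite figure is sensitive to which services are counted, whereas the weighted figure only moves when a service the user actually depends on degrades.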