Docker Hub is down (again)

Is this why Home Assistant on my Raspberry Pi cannot install the Matter server update from 8.1.0 to 8.1.1 that it is telling me is available?

It gives this error:

> Error during service call to update.install: Error updating Matter Server: Can't install homeassistant/aarch64-addon-matter-server:8.1.1: 401 Client Error for http+docker://localhost/v1.51/images/create?tag=8.1.1&fromImage=homeassistant%2Faarch64-addon-matter-server&platform=linux%2Farm64: Unauthorized ("unauthorized: authentication required")

From the "localhost" in the URL I assumed it was an error with a local Docker instance but I have no idea how HA actually works under the hood. I used the install method where you use the Raspberry Pi Imager to make a bootable HA RPi image and that takes complete control of the RPi. There's a Linux in there, but I've got no login on it. It is a complete black box to me with all my interaction through their web interface or the mobile app. Presumably it has to get 8.1.1 of the Matter server from somewhere, and if that is failing maybe it makes the localhost Docker fail too?

	▲	pkage 15 hours ago | parent [-]
		Yes, that's an HTTP connection to the Docker Engine API on localhost failing due to the same issue: the Docker Engine can't negotiate with Docker Hub to get the new image and is passing the error back through the local API to your updater process.
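
The error message itself shows this split. The URL is a request to the local Docker Engine API's image-create (pull) endpoint, and the image to fetch from the registry is carried in the query string. A minimal Python sketch, assuming only the URL quoted in the error above (the helper name `describe_pull` is hypothetical and purely illustrative):

```python
from urllib.parse import urlsplit, parse_qs

# The URL quoted in the error message above.
ERROR_URL = (
    "http+docker://localhost/v1.51/images/create"
    "?tag=8.1.1&fromImage=homeassistant%2Faarch64-addon-matter-server"
    "&platform=linux%2Farm64"
)


def describe_pull(url: str) -> dict:
    """Break a Docker Engine API image-pull URL into its parts (illustrative helper)."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)  # parse_qs also decodes %2F back to "/"
    return {
        "engine_host": parts.netloc,      # where the Docker Engine API is reached
        "endpoint": parts.path,           # versioned image-create (pull) endpoint
        "image": query["fromImage"][0],   # repository the engine must fetch remotely
        "tag": query["tag"][0],
        "platform": query["platform"][0],
    }


info = describe_pull(ERROR_URL)
for key, value in info.items():
    print(f"{key}: {value}")
```

So "localhost" only identifies the engine that was asked to pull; the 401 most likely originates from the registry the engine contacted on the updater's behalf.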

Havoc 16 hours ago | parent | prev | next [-]

Thank goodness we don’t basically have a monoculture…right guys?

	▲	yawaramin 13 hours ago | parent | next [-]
		Thank goodness we all set up pull-through caches of Docker Hub and use the full sha of the images we use, so that the cache doesn't need to query Docker Hub for metadata updates...right guys?
	▲	johnisgood 14 hours ago | parent | prev [-]
		We will never learn. I want GitHub to go down for a few days. :D
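
On the "full sha" point above: a tag such as `8.1.1` can be re-pointed to a different image, while a reference of the form `name@sha256:<digest>` is content-addressed and fixed. A rough Python sketch of telling the two apart, assuming the common `@sha256:` plus 64 hex characters form; the function name and the all-zero digest are placeholders, not real values:

```python
import re

# A digest-pinned reference ends in "@sha256:" followed by 64 lowercase hex characters.
_DIGEST_SUFFIX = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_digest_pinned(ref: str) -> bool:
    """Return True if the image reference is pinned by sha256 digest (illustrative check)."""
    return _DIGEST_SUFFIX.search(ref) is not None


examples = [
    "homeassistant/aarch64-addon-matter-server:8.1.1",  # tag only, not pinned
    "example/image@sha256:" + "0" * 64,                 # placeholder digest, pinned
]

for ref in examples:
    print(ref, "->", "pinned" if is_digest_pinned(ref) else "tag only")
```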

leakycap 16 hours ago | parent | prev | next [-]

Crazy that we're 1 hour in and even basic authentication is still down... and no updates?!

cranberryturkey 16 hours ago | parent [-]

I wonder why they can't roll back

whalesalad 16 hours ago | parent [-]

probably because docker hub is down

	▲	XCSme 16 hours ago | parent [-]
		Lol, that would actually be funny, they can't restart it because it would require pulling the image from itself.

leakycap 14 hours ago | parent | prev | next [-]

UPDATE: It's up. But the link that previously was a status page now redirects to something else.

Details on this incident: https://www.dockerstatus.com/pages/history/533c6539221ae15e3...

Unacceptable level of communication during critical downtime; I know no one who was able to access or use Docker, but it is still listed in the history as partial service degradation.

selcuka 13 hours ago | parent | prev | next [-]

Docker Registry Uptime is 100% [1]. I wonder what should have happened to record a downtime?

[1] https://www.dockerstatus.com/

thehamkercat 17 hours ago | parent | prev | next [-]

It's funny that the status page is all green, and says "All Systems Operational"

XCSme 16 hours ago | parent [-]

Some yellow: https://www.dockerstatus.com/

thehamkercat 16 hours ago | parent [-]

Yeah, they turned yellow after around 15-30 minutes of the incident

I wonder why it isn't automated

eddythompson80 16 hours ago | parent | next [-]

Status pages stopped being automated a long time ago because they are bad PR.

Often you’d have dozens if not hundreds of services on a status page. If you have a major networking outage, for example, then everything is technically down. Someone screenshots the sea of red that your automated status page is showing and tweets “lol everything is down at [insert company]”. Then you get a million imverysmart people posting about single points of failure or whatever.

As a result status pages, in every place I know, require a human to actually declare the outage there. Internal ones are usually automated, but if your service is down due to dependency on another service, you don’t mark yourself as down.

Also most places I know of have moved away from public status alerts anyway. You get a customized alert in your account or email if you happen to be impacted by a particular outage. The public ones are for the very very _very_ bad outages.

	▲	andrewfong 15 hours ago | parent [-]
		My understanding is that it's also a legal CYA. If you have SLAs in place, outages might mean you owe money. So companies tend to err on the side of underreporting.

jbmchuck 16 hours ago | parent | prev | next [-]

My guess/experience - because there are probably layers of management and executives who have an uptime # in their OKRs or whatever is fashionable these days.

The decision to post anything about outages comes from the executive chain in many orgs lest they miss out on bonus compensation for the year.

This is the same reason services like Docker and AWS will very rarely call an outage an 'outage' - it's always 'service degradation', even when Docker Hub is completely useless as it is right now.

XCSme 16 hours ago | parent | prev [-]

I am surprised that they are "working on a fix" for more than 2 hours now, given the scope of the problem.

frabonacci 16 hours ago | parent [-]

Also this comes just a couple of days after a similar incident affected all of Spain

	▲	yomismoaqui 15 hours ago | parent [-]
		Are you referring to the blocking of Cloudflare when La Liga matches are played? That affects sites that use Cloudflare, but it's not the fault of Docker Hub.

15 hours ago | parent | prev | next [-]

[deleted]

dcbad 15 hours ago | parent | prev [-]

got an update: [Identified] We are continuing to work on implementing a fix. We will update as the status evolves.

....