Remix.run Logo
palcu a day ago

Thank you! Opening an incident as soon as user impact begins is one of those instincts you develop after handling major incidents for years as an SRE at Google, and now at Anthropic.

I was also fortunate to be using Claude at that exact moment (for personal reasons), which meant I could immediately see the severity of the outage.

koakuma-chan a day ago | parent | next [-]

It's important for companies to use their own products.

awesome_dude a day ago | parent [-]

Unless using your own dogfood prevents you from fixing it if it breaks

https://www.theguardian.com/technology/2021/oct/05/facebook-...

I have a memory that Slack fell into this trap too (I could be wrong)

hiddencost a day ago | parent | next [-]

Facebook notoriously had to cut open the doors to one of their data centers.

Google SRE still keeps IRC available in case of an emergency.

antonvs a day ago | parent | prev [-]

Now I’m imagining the folks at Slack gritting their teeth and using MS Teams

aduwah a day ago | parent | prev | next [-]

Take my condolences, Sunday outages are rough

a day ago | parent | prev | next [-]
[deleted]
nrhrjrjrjtntbt a day ago | parent | prev [-]

Sweet. Hopefully it is more than instinct but a codified at Anthropic. I.e. a graduate engineer with little experience can assess and raise incident if needed.