
I've seen several companies where analysis like this would be for management only. I guess it's just human nature to want to sweep mistakes and accidents under the rug, but it does also speak volumes about the culture in such companies. Kudos to Microsoft and every other big player that communicates these things.


It reminds me of the NTSB's crash investigations. Instead of looking for a scapegoat or someone to blame, they look for the cause, and then look even deeper to find the root cause.

For example, say they discover a pilot made a mistake. They don't end it there: they then look at the airline's training materials, check whether other pilots would repeat the same mistake, and so on until they reach a "this won't happen again" resolution (rather than simply discovering what happened).

I feel like with Microsoft's breakdown they did the "this is what happened" post-mortem but then went to the next level and said "here's why this happened, and here is why it won't happen again."


Nitpick: Crash investigations are done by the NTSB, not the FAA.

The NTSB has no authority to enforce its recommendations. That's up to the FAA. The idea behind that separation is that the NTSB is more likely to be impartial.


Valid correction. I've edited it in. But it did originally say FAA, not NTSB.


>I've seen several companies where analysis like this would be for management only

I've found it to be pretty standard in the hosting world. I assume because if you have unexplained outages, customers leave.


Especially when it's an outage lasting 5-11 hours (depending on the customer), as this one was.



