
In every field of knowledge there are some things which, once expressed, will rarely be changed, save perhaps to be put in new wording for a new age. Their core remains untouched because everyone recognises the truth in it.

There is a document that was written about systems, by a safety researcher, some twenty years ago. I don’t know if its author gave much thought to complex software systems, but he nevertheless wrote a number of core truths, and every IT operations manager should consider this document a must-read.

It talks about complexity and how humans interact with that complexity — and on the topic of humans, the author is painfully accurate.

In my years in IT operations I frequently passed this document to others around me, and it was pleasing to go through some years in which more and more people had already seen it. But that came and went. The document was written decades ago and newcomers often have not come across it, so I am reminded to keep showing it to people.

Some genius went and made a single-page site out of it, making it all the easier to find, so I will simply provide the link, and the recommendation to read it. It’s a 10-minute read (take your time), but its lessons will stay with you for a lifetime.

Here it is: How Complex Systems Fail by Dr Richard Cook.

And here is a recording of Dr Cook talking at Velocity 2012 on “How Complex Systems Fail”.