Barry O’Reilly frames the problem as determining what makes a good architect: why do some architects succeed in conditions of high uncertainty while others fail to find similar success?
Engineers build ordered, structured, simple software and then deploy it into a dynamic, unstructured environment. Barry defines Stressors as events that emerge unpredictably and put stress on the system. A good architecture is resilient in the face of Stressors.
He then moves into a discussion on Complexity Science. The two key ideas are:
- Random Simulation – As engineers, we’re not very random, so our requirements gathering isn’t representative of the entire problem space. Our “randomness” produces a small cluster of points, which gives a misleading view of what can happen in the space.
- Kauffman Networks and NKP Analysis – Systems with a large number of nodes can arrive in a huge number of states. If nodes are connected, the number of states drops dramatically. Connected nodes are called Attractors.
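The Kauffman-network point can be made concrete with a tiny random Boolean network. This is a toy sketch, not anything from the talk: the node count `N`, connectivity `K`, and random seed are arbitrary choices. Enumerating every start state shows the full state space collapsing into a small number of repeating cycles (attractors).

```python
import itertools
import random

random.seed(42)

N = 5  # number of boolean nodes (kept small so we can enumerate all states)
K = 2  # inputs per node, i.e. how connected the network is

# Each node reads K randomly chosen nodes and applies a random boolean
# function, stored as a lookup table over the 2**K input combinations.
inputs = [random.sample(range(N), K) for _ in range(N)]
tables = [[random.randint(0, 1) for _ in range(2 ** K)] for _ in range(N)]

def step(state):
    """Advance the whole network one tick."""
    new = []
    for n in range(N):
        idx = 0
        for i in inputs[n]:
            idx = (idx << 1) | state[i]
        new.append(tables[n][idx])
    return tuple(new)

# Follow every possible start state until it revisits a state; the
# repeating cycle it falls into is an attractor.
attractors = set()
for start in itertools.product((0, 1), repeat=N):
    seen = {}
    state = start
    while state not in seen:
        seen[state] = len(seen)
        state = step(state)
    first_repeat = seen[state]
    cycle = frozenset(s for s, t in seen.items() if t >= first_repeat)
    attractors.add(cycle)

print(f"{2 ** N} possible states collapse into {len(attractors)} attractor(s)")
```

Because every trajectory must eventually revisit a state, all 2^N start states funnel into a handful of cycles, which is the “number of states drops dramatically” effect.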
The last key term is Residue. When a Stressor occurs, the architecture changes to support the stress. The part of the architecture that survives the change is the Residue. Multiple Residues combined into a single architecture can then stand up to several Stressors.
Good architects design around Attractors because those focal points handle many Stressors. Junior designers focus on happy paths, which only handle a few
Stressors. He suggests using extreme Stressors (e.g., Godzilla attacking our city) to help find Attractors that we’d otherwise miss.
Barry then provides an example of designing a car charger network, along with
Residues. One particularly interesting outcome is how handling a few early
Stressors results in an architecture that stands up to future Stressors.
He finishes the talk by going into further usage, explanation, and results. This is a very full talk, so it’s worth watching if you want more details.
Eli Goldratt and Inherent Simplicity
The talk reminds me of Eli Goldratt’s work on “inherent simplicity”, which holds that the more complicated a system appears, the more interconnected it is, and therefore the fewer independent processes (root causes) it actually contains.
Inherent simplicity and
Attractors seem to be describing the same concept: in a complicated system, seemingly unrelated leaf nodes will be causally connected to the same root node.
- I’m not sure that this is superior to considering Stressors from the bottom up. In his car charger example, “the network fails” would be a hopefully obvious failure mode, and following that failure would lead to the same set of possible solutions. I think the two analysis styles are more powerful used together than either alone.
- Some of the
Stressors for the car charger network aren’t really addressed. For example, “4. Electric car market fails” has the mitigation “convert to petrol stations”. It’s very unlikely that the naive architecture handles that case, so you need a Residue. But is that a likely enough business case to spend time on? That’s where risk management and cost/probability come into play. “Asteroid hits Earth and everyone dies” is a Stressor, but is it worth building a Residue to handle it?
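The cost/probability point above can be reduced to a back-of-the-envelope expected-loss comparison: build a Residue only when probability × impact exceeds its cost. Every number below is invented purely for illustration.

```python
# Toy triage of which Residues are worth building. All figures are
# hypothetical; real risk management would use estimated ranges.
stressors = {
    # name: (annual probability, cost if it happens, cost of the Residue)
    "charger firmware bug":   (0.30, 50_000,       5_000),
    "electric market fails":  (0.01, 2_000_000,  500_000),
    "asteroid ends humanity": (1e-8, 1e12,          1e9),
}

for name, (p, impact, residue_cost) in stressors.items():
    expected_loss = p * impact  # probability-weighted cost of doing nothing
    build = expected_loss > residue_cost
    print(f"{name}: expected loss {expected_loss:,.0f} "
          f"vs Residue cost {residue_cost:,.0f} -> build: {build}")
```

Under these made-up numbers, the mundane firmware-bug Residue pays for itself while the asteroid Residue never does, which matches the intuition that not every Stressor deserves a mitigation.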
Overall, this seems like a very good perspective on designing resilient and reliable systems. The main takeaway for me is to use more extreme
Stressors to test the system. Having already spent significant time thinking about inherent simplicity, I wasn’t surprised by the concept of Attractors; without that background, the talk would probably have been even better and more challenging.