Failure Is Not An Option

Before reading this post, I encourage you to have a go at this quick puzzle to test your problem solving.

Go on, you won’t regret it!

How did you do?

I’ve used this challenge numerous times in various talks and workshops, and I find that the majority of people follow the same pattern. They only test their hypotheses by guessing sequences which they think will succeed in following the rule, even on the occasions when I’ve primed them by talking about the need for failure. It’s only after repeated guesses that someone will eventually propose a sequence which does fail, and the “oohs” and “aahs” noticeably indicate learning.

This problem is a nice example to show that we learn more when we fail because we generate new information. Information Theory suggests maximum information is generated when the probability of failure is 50%. If we never fail then we must know everything already, and if we fail all the time then we must be repeating the same mistakes over and over again.
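The 50% figure can be checked with a few lines of Python. This is a minimal sketch, assuming the "information generated" is measured as the Shannon entropy of a pass/fail outcome; the function name `binary_entropy` is just illustrative.

```python
import math

def binary_entropy(p_fail: float) -> float:
    """Shannon entropy, in bits, of an outcome that fails with probability p_fail."""
    if p_fail in (0.0, 1.0):
        return 0.0  # a certain outcome carries no new information
    p_pass = 1.0 - p_fail
    return -(p_fail * math.log2(p_fail) + p_pass * math.log2(p_pass))

# Compare a few failure probabilities; the value peaks at 1 bit when p_fail is 0.5.
for p in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(f"P(fail) = {p:.1f} -> {binary_entropy(p):.3f} bits")
```

The curve is symmetric: always failing tells us as little as never failing, which matches the point about repeating the same mistakes.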

Of course, that doesn’t necessarily mean we always want to fail 50% of the time. We wouldn’t want any plane we fly in to have a 50% probability of crashing, but then we’re not really trying to generate new information and learning when we fly. Context is important. However, when we are developing new products, we do want to learn, so a 50% failure rate is more appropriate. Failure is not an option, it’s a necessity!

That’s easier said than done though, so here are some things I have learnt which may help.

Run Experiments

Firstly, to be open to failure we need to consider the assumptions we have made in taking decisions on what to do. We should treat those assumptions as hypotheses, and come up with ways to test them. A simple template which can be useful as training wheels is this one:

  • We believe that <solution>
  • Will result in <outcome>
  • We will know we have succeeded when <evidence>

Having said that, the puzzle this post opened with showed how we need our experiments to fail as well as succeed, so we also need to intentionally falsify our hypotheses. As Karl Popper wrote in The Poverty of Historicism:

The discovery of instances which confirm a theory means very little if we have not tried, and failed, to discover refutations. For if we are uncritical we shall always find what we want: we shall look for, and find, confirmations, and we shall look away from, and not see, whatever might be dangerous to our pet theories. In this way it is only too easy to obtain . . . overwhelming evidence in favour of a theory which, if approached critically, would have been refuted.

Thus the evidence we look for should show not only that we have proved our hypotheses, but also that we have failed to disprove them. This means moving from a fail-safe approach where we assume a low probability of failure, to a safe-to-fail approach where we can quickly and cheaply recover from failure without putting lives, careers or reputations at risk.

Illuminate Feedback

We need to run experiments because we are generally dealing with complex problems, where cause and effect are not repeatable and we can only explain results with retrospective coherence. We cannot rely on experts to prescribe good practice to follow and need to rely on emergent practice as a result of experimentation.

The consequence is that experts disagree, a telltale sign of a complex problem. Recognising a problem as complex, with no knowable solution, is the first step in breaking away from arguing over who is right or wrong. Instead we can use Chris Argyris’ Ladder of Inference to explore why we have differing opinions and how we might reconcile views and achieve mutual learning to allow solutions to emerge.

The ladder has the following rungs:

  • Take Actions
  • Adopt Beliefs
  • Draw Conclusions
  • Make Assumptions
  • Interpret Meaning
  • Select Reality
  • Observe Reality

We naturally tend to climb to the top of the ladder, quickly taking action without considering the beliefs, conclusions, assumptions, meaning and reality that have led to the action. By climbing down the ladder we can understand both why we take certain actions, and why others take different actions. It is probably due to different beliefs, conclusions, assumptions, interpretation or selection of data.

This is known as Assertive Inquiry, where we advocate for our view, while at the same time seeking to understand an alternate view. Discovering why we might recommend different actions in this way generates understanding, which can lead to potential experiments to test the various views.

We can think of this as shining a light on a problem. If we try and solve problems in the dark, we don’t see the information and feedback from which we can learn. I like the analogy of practising a golf swing in the dark. We won’t see where the golf ball ends up, so we can’t adjust accordingly.

This is similar to the Streetlight Effect; searching for something and looking only where it is easiest. It’s like the “joke” about the drunk who has lost his car keys, and is looking for them under the lamppost. When asked why he is looking there, his response is that he won’t find his keys where there is no light.

Expect the Unexpected

Running experiments and searching for feedback in this way means intentionally widening the scanning range so that we are more likely to pick up on information that we might naturally ignore due to our inherent biases.

First we need to overcome the God Complex; an unshakable belief characterised by consistently inflated feelings of personal ability, privilege, or infallibility. In my last post, In the Lap of the Gods, I talked about how this can lead us to not even acknowledge the need to run experiments or search for feedback in the first place. It’s why disagreement can be healthy and why we need to create safe environments where people can openly challenge our thinking.

Then there is Cognitive Dissonance; the inner tension we feel when our beliefs are challenged by evidence. Whenever I walk down a stationary escalator, something seems wrong. My brain is expecting movement but my body doesn’t experience any. It really should be just like walking down stairs, but it doesn’t feel that way. My belief is that the escalator is moving, but the evidence is that it is not, and our natural inclination is to believe our beliefs and ignore the evidence. Thus we dismiss any contrary feedback we receive as being wrong.

Related to this is Confirmation Bias; the tendency to favour information that confirms one’s pre-existing beliefs or hypotheses. This is what generally leads us to only try and prove our hypotheses, but also means that we only notice feedback which does prove our hypotheses. Again any contrary feedback we receive is ignored.

The situation is further complicated by Survivorship Bias; concentrating on the people or things that made it past some selection process and overlooking those that did not. A great example of this is the story of Abraham Wald, a statistician during World War 2. He was tasked with prioritising where to reinforce planes with better armour to increase survival rates, given that the planes’ weight limited the amount of armour possible. Available data from surviving planes showed the patterns of damage, and the common theory was to reinforce the most damaged areas.

However, Wald’s insight was that this damage was from planes which had returned, and which could therefore survive damage in those areas. As a result, planes which did not survive had probably been hit in the areas that were undamaged on the survivors, and this is where any reinforcement should go. Thus we need to pay attention not just to the information that we can see from our experiments, but also consider any information that we don’t see from failed experiments.
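The effect is easy to reproduce with a toy simulation. This is only a sketch under made-up assumptions, not Wald’s actual analysis: the section names, the loss probabilities in `LOSS_CHANCE` and the one-hit-per-plane rule are all hypothetical.

```python
import random

SECTIONS = ["engine", "fuselage", "wings", "tail"]
# Hypothetical chance that a single hit in each section brings the plane down.
LOSS_CHANCE = {"engine": 0.8, "fuselage": 0.1, "wings": 0.1, "tail": 0.2}

def simulate(planes: int, seed: int = 1):
    """Return hit counts per section for all planes and for survivors only."""
    rng = random.Random(seed)
    all_hits = {s: 0 for s in SECTIONS}
    survivor_hits = {s: 0 for s in SECTIONS}
    for _ in range(planes):
        section = rng.choice(SECTIONS)  # each plane takes one hit, spread evenly
        all_hits[section] += 1
        if rng.random() >= LOSS_CHANCE[section]:
            survivor_hits[section] += 1  # only these planes are ever inspected
    return all_hits, survivor_hits

all_hits, survivor_hits = simulate(10_000)
for s in SECTIONS:
    print(f"{s:9s} all planes: {all_hits[s]:5d}  survivors: {survivor_hits[s]:5d}")
```

Hits are spread evenly across all planes, yet the survivors show the least damage in the section where a hit is most dangerous, which is exactly the area the visible data would tell us to ignore.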

And then there is Availability Bias; relying on immediate examples that come to mind when evaluating a specific topic, concept, method or decision. An example is whenever someone mentions a new model of car to us, and suddenly we see that model everywhere. It’s not that it’s suddenly appearing more often, it’s just that our brain notices it more often because it’s more recently available for recall. Our preferred hypotheses will be more available, so we are more likely to notice feedback which relates to them.

So when considering a hypothesis it’s easy to notice information which is more immediately available, which survives experiments, and which confirms our opinions, formed from the belief that we are experts. And this is just a handful of the huge list of cognitive biases on Wikipedia!


There’s a nice acronym which suggests that a FAIL is a First Attempt In Learning. I use that to highlight that failure is not something that we should shy away from and treat as an enemy, but something we should embrace and befriend. That doesn’t mean encouraging and celebrating failure though. Too much failure is as bad as not enough.

What’s needed is Critical Thinking; the intellectually disciplined process of actively and skilfully conceptualising, applying, analysing, synthesising, and evaluating information to reach an answer or conclusion. The Backbriefing & Experiment A3s I use are intended to encourage this by helping focus on the three areas in this post –  running experiments, illuminating failure, and expecting the unexpected. 

To close, I would recommend Black Box Thinking by Matthew Syed to explore these ideas in more depth. The metaphor comes from the aviation industry, and in particular comparing it to the healthcare industry. Syed cites a 2013 study published in the healthcare Journal of Patient Safety which put the number of premature deaths associated with preventable harm at more than 400,000 per year. That is the equivalent of two 747 airliners crashing every day. His compelling argument is that the aviation industry, where you really don’t want failures, is actually extremely safe because of the way it uses Black Boxes to conscientiously learn from any failures when they do occur. The healthcare industry, on the other hand, has a history of brushing failures under the carpet as inevitable and just the nature of the job.

We would do well to take note, and be more like the aviation industry, applying the same attitude and discipline so that we can befriend and learn from failure.

The Science of Kanban – Conclusions

This is the final part of a write-up of a talk I gave at a number of conferences last year. The previous post was about the science of economics.

Scientific Management Revisited

Is scientific management still relevant for product development then? As I have already said, I believe it is, with the following clarifications. I am making a distinction between scientific management and Taylorism. Whereas scientific management is the general application of a scientific approach to improving processes, Taylorism was Taylor’s specific application of it to the manufacturing domain. Further, in more complex domains such as software and systems development, a key difference in application is that the workers, rather than the managers, should be the scientists, being closer to the details of the work.

Run Experiments

The use of a scientific approach in a complex domain requires running lots of experiments. The most well-known version is PDCA (“Plan, Do, Check, Act”) popularised by Deming and originally described by Shewhart. Another variation is “Check, Plan, Do”, promoted by John Seddon as more applicable to knowledge work because an understanding of the current situation is a better starting point, and Act is redundant because experiments are not run in isolation. John Boyd’s OODA loop takes the idea further by focussing even more on the present, and less on the past. Finally, Dave Snowden suggests “Safe To Fail” experiments as ways of probing a complex situation to understand how to evolve.

Whichever form of experiment is run, it is important to be able to measure the results, or impact, in order to know whether to continue and amplify the changes, or cease and dampen them. The key to a successful experiment is whether it completes and provides learning, not whether the results are the ones that were anticipated.

Start with Why

Knowing whether the results of an experiment are desirable means knowing what the desired impact, or outcome, might be. One model to understand this is the Golden Circle, by Simon Sinek. The Golden Circle suggests starting with WHY you want to do something, then understanding HOW to go about achieving it, and then deciding WHAT to do.


Axes of Improvement

One set of generalisations about WHY to implement Kanban, which can inform experiments and provide a basis for scientific management is the following:

  • Productivity – how much value for money is being generated
  • Predictability – how reliable are forecasts
  • Responsiveness – how quickly can requests be delivered
  • Quality – how good is the work
  • Customer Satisfaction – how happy are customers
  • Employee Satisfaction – how happy are employees

The common theme across these measures is that they relate to outcome or impact, rather than output or activity. Science helps inform how we might influence these measures, and what levers we might adjust in order to do so.


In these posts I have described Kanban in terms of the sciences of people, process and economics. However, this can actually be generalised to describe Lean as applied to knowledge work, as opposed to the traditional definition of Toyota’s manufacturing principles. The differentiation is also a close match back to my original Kanban, Flow and Cadence triad.

  • Kanban maps to process, with the emphasis on eliminating delays and creating flow rather than eliminating waste.
  • Flow maps to economics, with the emphasis on maximising customer value rather than reducing cost.
  • Cadence loosely maps to people and their capability, with the emphasis on investing in those who use the tools rather than the tools themselves.


The ideas in this article have been inspired by the following references:

The Flow Experiment

I put together a small simulation for the SPA Conference this year which seemed to go well, and which I re-ran at the London Limited WIP Society, and hope to run again. You can download the materials, and this is a short write-up of how it works so people can run it and experiment with it themselves.


The basic aim of the simulation is to solve maths problems. This idea was inspired by Simon Bennett and Mark Summers’ session The Incentive Trap, which also uses maths as the problem domain. The solving of equations introduces variability into the exercise using some simple knowledge work which is hopefully more interesting and engaging than rolling dice.

The maths problems flow through the following value stream:

  • Options
  • Analysis
  • Solve
  • Check
  • Accept
  • Done

The following roles are involved in the value stream:

  • Analyst
  • Solver
  • Checker
  • Accepter
  • Manager

The following scenarios are used to experiment with the flow:

  • Phase Driven
  • Time Boxed
  • Flow Based



Each scenario starts with a portfolio of possible problems to solve, in the following format:

ID  Operands  Solution
1   3         25

In this example we have an option to create an equation with 3 operands and a solution of 25.


When an option is selected, it is transformed into an equation during analysis. Rather than expecting participants to come up with their own equations, which could result in trivial equations, a lookup is provided.  The equations in the lookup are in a different order to those in the portfolio so some effort is required!

Operands  Solution  Equation
3         25        3 * 7 + 4
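In code, the analysis step amounts to a keyed lookup. The sketch below is only illustrative: it assumes each combination of operands and solution maps to exactly one equation, and the names `LOOKUP` and `analyse` are hypothetical. Only the first row comes from the example above; the second is made up.

```python
# Hypothetical lookup rows: (operands, solution) -> equation.
LOOKUP = {
    (3, 25): "3 * 7 + 4",  # the row from the example above
    (2, 12): "3 * 4",      # an invented row, for illustration only
}

def analyse(option_id: int, operands: int, solution: int) -> dict:
    """Turn a portfolio option into the content of an index card."""
    equation = LOOKUP[(operands, solution)]
    return {"id": option_id, "equation": equation}

card = analyse(1, 3, 25)
print(card)
```

In the real exercise the analyst does this matching by hand, and the shuffled order of the lookup sheet is what makes the step take effort.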


The equations are then solved independently, i.e. the solution is not available to the solver.


In order to check that the Solve stage produces a correct result, the equation is solved independently again.


Finally the two independent solutions are compared, along with the actual equation, to ensure it has been solved correctly.

ID  Operands  Solution  Equation
1   3         25        3 * 7 + 4


When the correct equation has been independently solved correctly twice, then the problem can be considered Done.
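The acceptance rule can be written down as a small check. This is a sketch, not part of the download pack: it assumes equations only use the four basic operators, and the helper names `evaluate` and `is_done` are hypothetical.

```python
import ast
import operator

# Only the four basic operators are supported in this sketch.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def evaluate(equation: str):
    """Evaluate a simple arithmetic expression such as '3 * 7 + 4'."""
    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError(f"unsupported expression: {equation!r}")
    return walk(ast.parse(equation, mode="eval").body)

def is_done(equation: str, expected: int, solver_answer: int, checker_answer: int) -> bool:
    """Accept only if the equation matches the option and both answers are correct."""
    actual = evaluate(equation)
    return actual == expected and solver_answer == actual and checker_answer == actual

print(is_done("3 * 7 + 4", 25, 25, 25))  # True
print(is_done("3 * 7 + 4", 25, 25, 33))  # False
```

The second call is rejected because the checker’s answer differs, even though the solver got it right; both independent answers have to agree with the equation.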



The analyst selects the options from the portfolio, matches them against the available equations, and writes them onto index cards. Each index card should contain the option ID and the equation as follows:



The solver takes each index card with an equation on it, and solves it. Any intermediate calculations should be written on a separate sheet, and calculators should not be used (although someone who did use a calculator at SPA didn’t seem to gain any advantage!) The answer is to be written on the back of the index card, to the left side, and covered with a small post-it so that it is hidden and can’t be copied.



The checker also takes each index card with an equation on it, and solves it. Again, any intermediate calculations should be written on a separate sheet, and calculators should not be used. This time, the answer is to be written on the back of the index card, to the right side, and again covered with a small post-it so that it is hidden.



The accepter takes the index card and confirms whether the ID and equation match correctly, and that the two answers are both the same and correct. If they are, then the problem is Done, otherwise they reject it. Each scenario will handle rejection differently.


The manager’s job is to keep time, ensure the process is being followed and capture metrics. Every 30 seconds they should count how many of the maths problems are in each stage of the value stream and record it on a worksheet. It is these numbers which can be fed into a spreadsheet to generate a Cumulative Flow Diagram to visualise the flow.
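For anyone who prefers code to a spreadsheet, the worksheet counts can be turned into the series a Cumulative Flow Diagram plots. This is a rough sketch with made-up counts; the function name `cumulative_flow` is hypothetical and it is not how the spreadsheet in the pack works internally.

```python
STAGES = ["Options", "Analysis", "Solve", "Check", "Accept", "Done"]

def cumulative_flow(snapshots):
    """For each snapshot, count the problems that have reached each stage or beyond."""
    rows = []
    for counts in snapshots:
        running = 0
        row = {}
        # Walk from Done back to Options, accumulating as we go.
        for stage in reversed(STAGES):
            running += counts.get(stage, 0)
            row[stage] = running
        rows.append(row)
    return rows

# Made-up worksheet counts, one snapshot per 30 seconds.
snapshots = [
    {"Options": 10},
    {"Options": 8, "Analysis": 1, "Solve": 1},
    {"Options": 6, "Analysis": 1, "Solve": 1, "Check": 1, "Done": 1},
]
rows = cumulative_flow(snapshots)
for seconds, row in zip(range(0, 90, 30), rows):
    print(seconds, [row[stage] for stage in STAGES])
```

Plotting each stage’s column against time gives the stacked bands of the diagram: the vertical gap between two lines is the work in progress, and the horizontal gap approximates how long problems take to get through.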



Each scenario lasts 5 minutes.

Phase Driven

For a phase driven approach, the team should initially plan how many of the set of options they think that they can complete in the 5 minutes available. Then all the selected options are worked on phase by phase. Thus they are all analysed, then all handed over to be solved, then all handed over to be checked, and finally all handed over to be accepted. Any rejected work can only be moved back to the beginning once everything else has been accepted as Done.

Time Boxed

For the time boxed approach, the team should plan how many of the set of options they think that they can complete in the 1st of the 5 minutes. Those options are then worked on individually by the team. Specialism still applies, but once a problem has been analysed, it can move on to be solved, checked and accepted without waiting for the whole batch. At the end of the 1 minute time-box, the team should stop, review and re-plan the next minute, deciding how many problems to work on next. This is repeated until the 5 minutes are up, i.e. there are 5 x 1 minute time boxes. Any rejected work can be passed back immediately.

Flow Based

For the flow based approach, the team should pick 1 problem at a time to solve. As with the time boxed scenario, specialism still applies, so once a problem has been analysed, it can move on to be solved, checked and accepted. However, there should only be one problem in each stage of the value stream at a time, thus creating a pull system. Any rejected work can be passed back immediately (which may result in the WIP limits being broken), or the accepter can pull in the appropriate role to resolve the issue.
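The pull rule can be sketched as a tiny simulation. It makes two big simplifying assumptions that the real exercise does not: every stage takes exactly one tick, and nothing is ever rejected. The function name `run_flow` is hypothetical.

```python
STAGES = ["Analysis", "Solve", "Check", "Accept"]

def run_flow(options, ticks):
    """Advance problems one stage per tick, with at most one problem per stage."""
    backlog = list(options)
    board = {stage: None for stage in STAGES}
    done = []
    max_wip = 0
    for _ in range(ticks):
        # Finish whatever is in the last stage, freeing its slot.
        if board["Accept"] is not None:
            done.append(board["Accept"])
            board["Accept"] = None
        # Work from the downstream end so each stage pulls only into a free slot.
        for upstream, downstream in reversed(list(zip(STAGES, STAGES[1:]))):
            if board[downstream] is None and board[upstream] is not None:
                board[downstream], board[upstream] = board[upstream], None
        # Pull a new option only when Analysis is empty.
        if board["Analysis"] is None and backlog:
            board["Analysis"] = backlog.pop(0)
        max_wip = max(max_wip, sum(item is not None for item in board.values()))
    return done, max_wip

done, max_wip = run_flow(options=[1, 2, 3, 4, 5, 6], ticks=12)
print(done, max_wip)
```

Because a stage can only take work when it has a free slot, the work in progress never exceeds the number of stages, however many options are waiting in the portfolio.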


The metrics from the managers’ worksheets can be fed into an Excel spreadsheet (included in the download package) to generate CFDs. Here are 3 from one of the teams at SPA.

Phase Driven


Time Boxed


Flow Based



There are a number of variations I’d like to try.

  • One of the things I’ve noticed is that the maths problems may be just a little bit too difficult for some teams, and they sometimes take too long to get any really useful results. One option would be to extend the time for each scenario to 10 minutes to allow more time. I wonder whether this could make it less snappy though.
  • The time-boxed scenario never really plays out how I envisaged it. This is partly down to the short time frames. Stopping, reviewing and replanning every minute doesn’t seem right – especially when you can only manage 1 problem in a minute! What I was trying to show was the small-batching nature a time-box can have. One way round this is to explicitly create the batches in a similar way to the Penny Game.
  • Some people don’t like the mental exercise involved in the maths! Katherine Kirk described a variation to me where the teams used a “Pictionary” workflow instead. Options –> Describing –> Drawing –> Guessing –> Checking –> Done
  • It’s quite likely that the Flow scenario comes out “best” because it’s the last one. It would be interesting to run the scenarios in different orders to see what impact that had, especially if there are 3 or more teams so that each team can start with a different scenario. This would possibly be more complicated to run, but with enough facilitation could be done.

Feel free to download the pack, which contains:

  • Handouts – PDFs of the options, analysis and accepter worksheets for each scenario
  • Spreadsheets – one with all the details used to create the worksheets, and one to be used to create the CFDs
  • Powerpoint – slides with simple instructions for running the experiment

All I ask is that you let me know how you got on, and what variations you come up with. Here are the SPA results and LWS results.