The Ball Flow Game

I was invited to the Scrum Gathering in Amsterdam this week to give a Deep Dive on Kanban. My Kanban Exploration slides can be downloaded from SlideShare. To introduce the session and reinforce the concepts of Flow, Value and Capability, I tried a variation of the Ball Point Game that is commonly used in Scrum training, inspired by an email discussion with Jean Tabaka and Eric Willeke.

Here are a couple of links (Kane Mar) (Declan Whelan) if you’re not familiar with the game. In a nutshell, it involves a group working as a team to pass balls between themselves, constrained by some rules. The idea is to pass as many balls as possible in a 2-minute time box. The team has to self-organise, inspecting and adapting in order to improve its velocity (throughput of balls).

For my variation I wanted to remove the time-box to emphasise flow more, and demonstrate a different way of understanding the capability of a system. In the game, the team are designing a system to meet the purpose of flowing balls quickly between themselves.

The changes I made were to ask the team to pass 20 balls as quickly as possible. I put a unique number (1 – 20) on each ball in case it was useful, and also asked the team to time how long it took for each ball to pass through the system. I took the data that was captured and entered it into a spreadsheet to create a control chart. We ran two rounds of the game, with the respective charts below.

Round 1

[Image: Round 1 control chart]

In Round 1, the team didn’t capture all the data and there were some problems towards the end, but the average time for each ball was 13 seconds. The system could also be said to be ‘in control’, as all the data points were within the control limits, which were calculated as AVERAGE +/- (3 * STDEV). The last measured ball was completed at 3 minutes and 35 seconds.
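
For anyone who wants to reproduce the limits outside a spreadsheet, here is a minimal Python sketch of the AVERAGE +/- (3 * STDEV) calculation. It assumes STDEV means the sample standard deviation, which is what statistics.stdev computes. The times are made-up illustrative values, not the data captured in the session, and the function names are hypothetical.

```python
from statistics import mean, stdev


def control_limits(times, sigmas=3.0):
    """Return (lower limit, centre line, upper limit) for a list of times in seconds."""
    centre = mean(times)
    spread = stdev(times)  # sample standard deviation (n - 1 in the denominator)
    return centre - sigmas * spread, centre, centre + sigmas * spread


def points_outside_limits(times, sigmas=3.0):
    """Return (ball number, time) pairs that fall outside the control limits."""
    lcl, _, ucl = control_limits(times, sigmas)
    return [(n, t) for n, t in enumerate(times, start=1) if t < lcl or t > ucl]


# Illustrative ball times in seconds (NOT the session data).
example_times = [10, 12, 14, 11, 13]
lcl, centre, ucl = control_limits(example_times)
outliers = points_outside_limits(example_times)
print(f"LCL={lcl:.1f}s  average={centre:.1f}s  UCL={ucl:.1f}s  outside={outliers}")
```

An empty list of points outside the limits is what the post means by the system being ‘in control’.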

Round 2

[Image: Round 2 control chart]

In Round 2, the team improved their data capture process and overall flow. The average time per ball dropped to 12 seconds and the variability also reduced. The Upper Control Limit dropped from 01:10 to 00:18. The last measured ball was completed at 2 minutes and 22 seconds.

What this demonstrates is that even with variability (which we don’t want to eliminate completely in software product development), by understanding the capability of the system over time, we are able to reliably communicate what might and might not be possible. For example, using the Round 2 data, there is roughly a 50% chance we’ll complete a ball within 12 seconds (the average) and roughly a 99% chance we’ll complete a ball within 18 seconds (the Upper Control Limit).
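
As a rough check on those percentages, the sketch below treats ball times as normally distributed, which is an assumption and not something the data proves. It takes the Round 2 average (12 seconds) and Upper Control Limit (18 seconds) from above, backs out the standard deviation as a third of the gap between them, and asks what share of balls the model expects at or below each figure. The variable names are hypothetical.

```python
from statistics import NormalDist

# Figures quoted for Round 2 in the post.
mean_seconds = 12.0
ucl_seconds = 18.0

# The UCL was AVERAGE + 3 * STDEV, so the standard deviation is a third of the gap.
sigma_seconds = (ucl_seconds - mean_seconds) / 3

# Assumption: ball times are roughly normal around the average.
model = NormalDist(mu=mean_seconds, sigma=sigma_seconds)

p_within_mean = model.cdf(mean_seconds)  # share of balls at or below the average
p_within_ucl = model.cdf(ucl_seconds)    # share of balls at or below the UCL

print(f"P(time <= {mean_seconds:.0f}s) = {p_within_mean:.1%}")
print(f"P(time <= {ucl_seconds:.0f}s) = {p_within_ucl:.1%}")
```

Under that assumption the average sits at the 50% mark and the upper limit a little above 99%. Real ball times are likely to be skewed, so treat these as ballpark figures.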

We could also calculate and chart the throughput of balls completed over a cadence of 30 seconds to understand the capability from that perspective. For Round 2, those throughputs would have been 3, 4, 4, 5, 4.
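
Counting throughput per cadence is a matter of bucketing completion times. Here is a minimal Python sketch, assuming you have recorded the elapsed time in seconds at which each ball finished. The completion times are invented for illustration and are not the Round 2 data, and the function name is hypothetical.

```python
from collections import Counter


def throughput_per_interval(completion_seconds, cadence=30):
    """Count how many balls finished in each consecutive interval of `cadence` seconds."""
    if not completion_seconds:
        return []
    # Interval 0 covers [0, cadence), interval 1 covers [cadence, 2 * cadence), and so on.
    buckets = Counter(int(t // cadence) for t in completion_seconds)
    last_interval = max(buckets)
    # Include intervals in which nothing finished, so gaps show up as zeros.
    return [buckets.get(i, 0) for i in range(last_interval + 1)]


# Illustrative elapsed completion times in seconds (NOT the session data).
example_completions = [8, 19, 27, 33, 41, 52, 59, 64, 88]
throughputs = throughput_per_interval(example_completions)
print(throughputs)
```

Charting the resulting list per round gives the throughput view of capability alongside the per-ball control chart.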

There are a few areas I’d change next time I try this.

  1. The measurement took a long time and was clearly the significant bottleneck. I made measurement part of the system to add some additional complexity, but in hindsight it was probably too much. Most of the improvements were in measuring the system rather than in the performance of the system.
  2. I allowed more time than I probably should have for improvement discussions. With the time-boxed version it’s easier to start the clock for a round, and that usually kicks the team into action. Similarly, when the measurement fell apart we stopped and restarted a couple of times. I wouldn’t do that next time, although by removing measurement from the system, it might be less of a problem.
  3. It took time to enter the data into the spreadsheet. I need to find a better way! The spreadsheet can be found here. It’s very simplistic. Please let me know if you use it and improve it!

7 Comments

  1. Karl,
    Here is a thought. It is true that measurement was the bottleneck. However, I noticed that people were trying to optimize for the measurement not for the real thruput of value. One time, there was an excellent idea to improve the thruput significantly. However, they couldn’t do it because it wouldn’t let them measure effectively.

    So I thought it was teaching a lesson that sometimes measurement becomes a burden and maybe the team needs to rethink the way it measures.

    Manoj

  2. Karl,

    The way you compute the UCL and LCL in the Excel spreadsheet is exactly how Don Wheeler says that you should not do it (if I understand it correctly). Maybe worth looking up in “Understanding Statistical Process Control” by Don Wheeler. I think we need to set a good example.

    Patrick.

    1. Thanks for the feedback Patrick. That’s a book I need to read!

    2. Karl’s approach would be correct if he knows the /population/ mean and standard deviation, which I think he does in the given example above: each ball is timed, so it’s not a sample. The other formulas for UCL and LCL come into play when working with samples rather than the full population. In most situations we don’t know the full population and thus don’t know the population’s mean or standard deviation. Let’s say that you take n=5 samples at the top of each hour of a long running process. Compute a mean for each set of samples (to get a set of xbars). Estimate the mean using the mean of the xbars (which we’ll call xdoublebar, the mean of the sample means). Estimate the standard deviation for the population as the standard deviation of all the samples (and call that s). Then, the UCL and LCL are 3s/squareroot(n) above and below the center line. You should have at least k=25 xbars to use this approach.

  3. BTW, I started a collection of tips on Kanban-mechanics at http://t.co/tJevzfO

  4. Andrew,

    Thanks for pointing that out. What would be the advantage/disadvantage of working with moving averages rather than individual datapoints to your knowledge?

    Patrick.

    1. Hi Patrick. Not sure how to answer that. I don’t know how you’d come up with UCL/LCL if all you had were moving averages. If you have the full data, use it; the statistics will be more accurate using population data if you can get it rather than using samples. Perhaps I’m answering the wrong question?
