What should we measure?

Flow Metrics show the state of the product, how quickly our work flows through the system, whether it’s likely that features will be completed by a date, and what kind of work is being prioritized.
They are tied to business value and provide us with the feedback we need to make decisions about the next steps for our product.

Perhaps more importantly, Flow Metrics are based on measured data about how work actually moved through the system, not on estimates, which makes them a more reliable basis for forecasting than estimate-based techniques.

The key metrics are:

  • Flow (Work In Progress) Load: How much work is in our system
  • Cycle Time: The average time that stories take to complete
  • Throughput: How much work we typically complete in a defined time (week, sprint, quarter etc.).
  • Flow Distribution: What kind of work is being prioritised (Features, Risk (including Compliance), Tech Debt & Bugs)
  • Flow Efficiency: How much wait time we have in the system

Flow Load: How much work is in our system → How long it will take to get to Done

Why it matters: “The single most important factor that affects wait time is capacity utilization.” – Dominica DeGrandis
If the amount of Work in Progress is greater than your team size (taking pairs & mobs into account), then you won’t be able to predict the delivery date of the work. Look for bottlenecks where work is getting stuck, and focus on creating flow there.

How to see it: You can either simply count the items in progress per week, and chart this over time, or use the Cumulative Flow Diagram that is generated automatically by Azure DevOps and Jira.
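The simple weekly count can be sketched in a few lines. This is only an illustration under stated assumptions: the item dates, the `TEAM_SIZE` value and the function names are hypothetical, and in practice the start and finish dates would come from an export of your Azure DevOps or Jira board.

```python
from datetime import date, timedelta

# Hypothetical work items: (started, finished). finished is None while still in progress.
ITEMS = [
    (date(2024, 1, 1), date(2024, 1, 10)),
    (date(2024, 1, 3), None),
    (date(2024, 1, 8), date(2024, 1, 12)),
    (date(2024, 1, 15), None),
]

TEAM_SIZE = 2  # assumption: a pair or a mob counts as one unit


def wip_on(day, items):
    """Count items started on or before `day` that are not yet finished on `day`."""
    return sum(
        1
        for started, finished in items
        if started <= day and (finished is None or finished > day)
    )


def weekly_wip(items, first_day, weeks):
    """Return one (sample_day, wip) pair per week, starting at `first_day`."""
    samples = []
    for i in range(weeks):
        day = first_day + timedelta(weeks=i)
        samples.append((day, wip_on(day, items)))
    return samples


if __name__ == "__main__":
    for day, wip in weekly_wip(ITEMS, date(2024, 1, 5), 3):
        note = "over team size" if wip > TEAM_SIZE else "ok"
        print(day.isoformat(), wip, note)
```

Charting the resulting pairs over time gives the trend; the comparison against team size mirrors the rule of thumb above.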

Some ways to improve: We want our WIP to be just a bit smaller than the team size (including Pairs). If work is queuing in one area, this is a Bottleneck; get the whole team to look at the best way to unblock items in Wait status.

The most effective way to clear excess WIP is to move to a pull-based approach: take in only the top-priority stories and work on those until they are complete before picking up new items.

This flow-based approach means more work gets completed at a higher quality.

If our stories per sprint are higher than our throughput, this is usually an indicator of high WIP as well. Even with the best of intentions, teams discover new requirements during the sprint, or have unexpected bugs etc. One way to accommodate unforeseen work is to reserve a (historically based) percentage of capacity for it at the start of the sprint; another is to ensure a level of clarity (e.g. using BDD) during the refinement ceremony.
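As a rough illustration of the historically based percentage, the sketch below derives the share of unplanned work from past sprints and subtracts it from average throughput. The sprint history and the function names are invented for the example; the idea is only that the buffer comes from data rather than from a guess.

```python
import math
from statistics import mean

# Hypothetical history, one entry per sprint: (items completed, of which unplanned)
HISTORY = [(10, 2), (8, 2), (12, 3)]


def unplanned_share(history):
    """Fraction of completed items that were not planned at sprint start."""
    total = sum(done for done, _ in history)
    unplanned = sum(extra for _, extra in history)
    return unplanned / total


def planned_capacity(history):
    """Items to plan for: average throughput minus the historic unplanned share."""
    throughput = mean(done for done, _ in history)
    return math.floor(throughput * (1 - unplanned_share(history)))


if __name__ == "__main__":
    print(f"Unplanned share: {unplanned_share(HISTORY):.0%}")
    print("Stories to plan:", planned_capacity(HISTORY))
```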

Cycle Time: The average time that stories take to complete → How likely we’ll meet a date

Why it matters: Cycle Time uses our historic performance to forecast when stories are likely to be delivered. It’s more reliable than velocity because it is objective data based on actual duration, including the realities of wait times, dependencies and rework that we tend to leave out of our estimates. Velocity, by contrast, rests on guesses about the future, which are seldom accurate.

How to see it: Azure DevOps and Jira calculate this automatically; you just need to define the parameters.

  • Define what your start date is: Usually dev teams work from the point that they start backlog refinement; some teams work from when an item is brought into a sprint.
  • Define your end point: this should be ‘In Production’, but teams for which this doesn’t seem feasible can use LCT Complete / Tested in PPE etc.
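Once the start and end points are agreed, the calculation itself is small. The sketch below assumes you have exported one (start, end) date pair per completed item; the sample dates and function names are hypothetical and are not taken from either tool.

```python
from datetime import date, timedelta
from statistics import mean

# Hypothetical completed items: (start point, end point), using the team's own definitions.
COMPLETED = [
    (date(2024, 3, 1), date(2024, 3, 5)),
    (date(2024, 3, 2), date(2024, 3, 12)),
    (date(2024, 3, 4), date(2024, 3, 8)),
    (date(2024, 3, 6), date(2024, 3, 20)),
]


def cycle_times(items):
    """Elapsed calendar days per item, so wait time and rework are included."""
    return [(end - start).days for start, end in items]


def average_cycle_time(items):
    """Average cycle time in days."""
    return mean(cycle_times(items))


def expected_done(start, items):
    """Average-based forecast: start date plus the average cycle time."""
    return start + timedelta(days=round(average_cycle_time(items)))


if __name__ == "__main__":
    print("Average cycle time (days):", average_cycle_time(COMPLETED))
    print("Item started 2024-04-01, expected around:", expected_done(date(2024, 4, 1), COMPLETED))
```

Because this is an average, roughly half of the items can be expected to take longer than the forecast date.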

Some ways to improve: The biggest challenge to Cycle Time is unaccounted waiting time.
Wait time increases wherever we have bottlenecks. A Value Stream Map exercise that compares Value Adding time (item being worked on) with Wait time (the two together give Total Time) will show where your longest delays are, i.e. the bottlenecks to reduce. Both Jira & Azure DevOps can be queried to show when items are being worked on, and the total time.
Also, if you’re not using ‘Production’ as the end point, you may be hiding other bottlenecks in the overall flow that could be improved.

Throughput: How much work we usually complete in a defined time (cadence) → How many items we can expect to be able to do

Why it matters: If the number of stories per sprint is greater than our Throughput, we are creating a bottleneck in the system which actually slows down our delivery, so it decreases productivity instead of increasing it.
If our sprint stories match our historic Throughput, we should be confident of completing work within the defined time.

How to see it: Throughput is simply an average of the number of items completed in the defined time. If you use Sprints as the defined time, Azure DevOps & Jira should show this automatically. For other time periods, you can write a quick query to calculate this using the Sprint definition, and show it in a widget.
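For a period other than a sprint, the averaging can be sketched as below, here using calendar weeks. The completion dates, the planned count and the function names are made up for illustration, and the sketch skips weeks with no completions, which would overstate throughput if such weeks exist in your data.

```python
from collections import Counter
from datetime import date
from statistics import mean

# Hypothetical completion dates, e.g. exported from a board query.
DONE_DATES = [
    date(2024, 4, 1), date(2024, 4, 3),
    date(2024, 4, 9), date(2024, 4, 10), date(2024, 4, 11),
    date(2024, 4, 17),
]


def weekly_throughput(done_dates):
    """Map (ISO year, ISO week) to the number of items completed in that week."""
    counts = Counter()
    for done in done_dates:
        iso = done.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return dict(counts)


def average_throughput(done_dates):
    """Average items per week, over weeks that had at least one completion."""
    return mean(weekly_throughput(done_dates).values())


if __name__ == "__main__":
    planned = 4  # hypothetical number of stories planned for next week
    average = average_throughput(DONE_DATES)
    print("Average weekly throughput:", average)
    if planned > average:
        print("Planned work exceeds historic throughput")
```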

Some ways to improve: The fastest and most effective way to improve the ratio of Throughput to WIP is to move to a PULL-based system: only take on work when the system has capacity, don’t allow queues to build up, and get all ‘waiting’ work moving through to completion. This increases predictability and allows us to respond effectively to change. You can also reduce queues by reducing bottlenecks; these two often work hand-in-hand. The most common approach is simply to increase capacity. While this can bring an overall improvement, it is the least predictable.

Flow Distribution: What kind of work is being prioritised → What we should do next

Why it matters: This metric helps us to prioritize upcoming work. Flow Distribution is an indicator of the health / state of the product. It helps us understand which kinds of work are currently being focused on, and what we should focus on next to maximise customer happiness: delivering features they need while mitigating risk.

This metric is very context dependent, for example:

  • If only Features are being worked on, you may be ignoring risk or compliance items, which carries its own risk – or you may have addressed them early already, in which case you might be moving very smoothly.
  • If there is a large proportion of bugs, you may have tech debt items that need to be prioritized.
  • If most items are Risk / Compliance, you may have a deadline that you have to meet.

How to see it: This is a count of the different kinds of work completed during your time period (e.g. a sprint / quarter), shown as a percentage (a 100% stacked bar is very effective).

For this you need to categorise the work items, and then insert a widget displaying the data over time (both Azure DevOps and Jira have Stack Bars).

Typical categories, which between them span Revenue Generation, Revenue Protection and Failure Demand, include:

  • Features
  • Risk (including Compliance)
  • Tech Debt
  • Bugs
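The percentage view described above can be sketched as follows, assuming each completed item already carries one of the categories listed. The sample data and function name are hypothetical.

```python
from collections import Counter

# Hypothetical categories of the items completed in one period (e.g. a sprint).
COMPLETED_TYPES = [
    "Features", "Features", "Bugs", "Tech Debt",
    "Features", "Risk", "Bugs", "Features",
]


def flow_distribution(types):
    """Return each category's share of completed items as a percentage."""
    counts = Counter(types)
    total = sum(counts.values())
    return {kind: round(100 * count / total, 1) for kind, count in counts.items()}


if __name__ == "__main__":
    for kind, share in flow_distribution(COMPLETED_TYPES).items():
        print(f"{kind}: {share}%")
```

Running this once per period and stacking the results gives the data behind a 100% stacked bar.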

Some ways to improve: Each product will have a different balance of these work types at different levels of product maturity, and when facing external demands (new regulations; product competition; technical changes etc.).

Since this is a very context-specific metric, reviewing the Flow Distribution for the last few quarters along with priorities for the next quarter can prompt healthy conversations about the next work coming up.

Reviewing Flow Distribution during a quarter is a useful way to see whether your aims for the quarter are being met, and to adjust near-term priorities accordingly.

Flow Efficiency: How much wait time we have in the system → If there are lots of bottlenecks / obstacles

Why it matters: This metric tells us how much we could improve our delivery speed. Calculating it by co-creating a Value Stream Map reveals where we could best focus our attention to improve our efficiency.

The Theory of Constraints tells us that any system can only deliver at the Throughput rate of its weakest link.

How to see it: Flow Efficiency is Value Added time (the time that an item is being worked on) divided by the Total Time. The average Flow Efficiency in organizations is 15%; this can go as low as 5% or up to about 40%.

To calculate yours:

  • Do a Value Stream Map exercise showing each step the work goes through, from the customer requesting an item to the customer receiving it, and the average time* spent in each phase.
  • Add up the Value Adding time (the time that an item is being worked on) and the Total Time, then divide the Value Adding time by the Total Time.
    * If you have data, use data! Jira & Azure DevOps will give you this detail for time in development, but you may need to estimate the rest of the work.
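The arithmetic behind these steps can be sketched as below. The step names and day counts are invented for the example; your own Value Stream Map supplies the real figures.

```python
# Hypothetical value stream steps: (name, value-adding days, waiting days)
STEPS = [
    ("Refinement", 1, 4),
    ("Development", 3, 2),
    ("Test", 1, 5),
    ("Release", 1, 3),
]


def flow_efficiency(steps):
    """Value Adding time divided by Total Time (value-adding plus waiting)."""
    value_adding = sum(value for _, value, _ in steps)
    total = sum(value + wait for _, value, wait in steps)
    return value_adding / total


def longest_wait(steps):
    """Name of the step with the most waiting time, a likely bottleneck."""
    return max(steps, key=lambda step: step[2])[0]


if __name__ == "__main__":
    print(f"Flow Efficiency: {flow_efficiency(STEPS):.0%}")
    print("Longest wait is in:", longest_wait(STEPS))
```

With these sample numbers, 6 value-adding days out of 20 total days gives a Flow Efficiency of 30%, and the longest wait sits in the Test step.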

LINKS:

This 12 minute video explains Flow Metrics really well: what they tell us, and how we get to them: https://www.youtube.com/watch?v=uBEZoXc4A5w

Slides from Dominica DeGrandis: Agile2019_Dominica_Flow_metrics