Our Estimates are Terrible!

I’ve had a number of conversations with teams recently where people start by saying that they have a problem with their estimates, that the estimates are terrible and causing all kinds of planning problems. But do we really have a problem?

One team were concerned enough that they wanted to have a discussion on their estimates and the problems they perceived in their process. In preparation, the Scrum Master prepared a data-set of estimates and resultant actual data so we could base the conversation on data. The data collected included a story-by-story view of the original point estimate provided by the team and the subsequent actual story points recorded by the team on completion of the work. As a result we were able to create the following chart:

This is a frequency chart, a useful starting point for this kind of analysis. For each of the estimates the team made (the bottom set of numbers: 0.5, 1, 2, etc.), it shows the number of times the actual result was a particular value. So for an estimate of “3” there were 24 times that the actual was also 3 (the tallest spike on the chart), 15 times it was a 2 (to the left of the tallest spike), 11 times it was an 8 (to the right of the tallest spike), and so on.
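If you want to build the same kind of view from your own data, the counting behind a frequency chart is simple. Here is a minimal sketch in Python; the function name and the sample pairs are made up for illustration and are not the team's data.

```python
from collections import Counter, defaultdict


def frequency_table(stories):
    """Map each estimate to a Counter of the actuals recorded against it."""
    table = defaultdict(Counter)
    for estimate, actual in stories:
        table[estimate][actual] += 1
    return table


# Hypothetical (estimate, actual) pairs, for illustration only.
sample_stories = [(3, 3), (3, 3), (3, 2), (3, 8), (5, 5), (5, 13), (5, 5), (2, 2)]

table = frequency_table(sample_stories)
for estimate in sorted(table):
    row = ", ".join(
        f"{actual}: {count}" for actual, count in sorted(table[estimate].items())
    )
    print(f"estimate {estimate} -> {row}")
```

Each printed row corresponds to one group of bars on the chart: the estimate, followed by how often each actual value turned up for it.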

Looking at the chart you can see that most of the time when the team said it was a number it turned out to be that number. Further, you can see there is a grouping around that number as well. This is good news. We cannot expect that when we estimate something we get it 100% right every time. Sometimes the actual work will be higher, sometimes lower. An estimate is not a commitment, after all. But the data shows that, overall, the estimates are pretty good: most actuals land on or close to the estimate.

The chart also shows something else. You can see that the distribution of the results is not a normal distribution; the “actual” results are skewed to the right. Again, this is as it should be. Distributions of results from estimates are not normal, and it is easy to understand why. If I think something is about a 5-sized piece of work, the worst I could be wrong on the low side is that it turns out to take no effort at all (i.e. the actual is 0). On the other side of the curve, the 5 will sometimes turn into a 20, a 40 and even, in some cases, a 100. It won’t happen a lot (you can see that from the frequency chart) but it will probably happen. Estimates in fact conform more closely to a log-normal distribution than to a normal distribution. What this means is that, in general, estimates are low, not because of bias in estimating, but because of the nature of the estimating process. We combat this effect by using velocity to help understand what is really possible for the team to produce, and by using ranges of velocity values when trying to understand what will happen in the future. But note there is still a probability of a “worst case”; there is a chance of “black swan events” that will seriously affect the overall plan.
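To get a feel for what a log-normal shape does to a plan, here is a small simulation sketch in Python. It assumes that the actuals for a “5” follow a log-normal distribution with a median of 5; the spread value is an arbitrary choice for illustration, not something measured from the team's data.

```python
import math
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Assumption: actuals for a "5" are log-normal with a median of 5.
# SIGMA (the spread) is an arbitrary illustrative value, not a measurement.
MEDIAN_ESTIMATE = 5
SIGMA = 0.6

actuals = [
    random.lognormvariate(math.log(MEDIAN_ESTIMATE), SIGMA) for _ in range(10_000)
]

mean_actual = statistics.mean(actuals)
median_actual = statistics.median(actuals)

print(f"median actual: {median_actual:.1f}")
print(f"mean actual:   {mean_actual:.1f}")
print(f"smallest:      {min(actuals):.1f}")
print(f"largest:       {max(actuals):.1f}")
```

The smallest value can never drop below zero, while the largest can sit a long way above the estimate. That long right tail pulls the mean above the median, which is the "estimates are low in general" effect described above, with no estimating bias involved.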

So what does your chart look like? Do you have a problem with your estimates? Do your 5’s line up around the 5, or do you find that a lot of 5’s seem to turn into 13’s, for example? If they mostly line up, your estimation approach seems to be working. If you have a different result then perhaps additional analysis is required. Are there differences based on new feature work versus defects? Is there a consistent bias? Etc.
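One way to start that additional analysis is to split the stories by type of work and count how often the actual landed on, over, or under the estimate. The sketch below is one possible approach in Python; the work-type labels, function name and sample records are hypothetical.

```python
from collections import defaultdict


def estimate_summary(stories):
    """Per work type, count stories that landed on, over, or under the estimate."""
    summary = defaultdict(lambda: {"on": 0, "over": 0, "under": 0})
    for work_type, estimate, actual in stories:
        if actual == estimate:
            bucket = "on"
        elif actual > estimate:
            bucket = "over"
        else:
            bucket = "under"
        summary[work_type][bucket] += 1
    return summary


# Hypothetical (work type, estimate, actual) records, for illustration only.
sample = [
    ("feature", 5, 5),
    ("feature", 5, 13),
    ("feature", 3, 3),
    ("feature", 8, 5),
    ("defect", 2, 5),
    ("defect", 1, 3),
    ("defect", 2, 2),
]

summary = estimate_summary(sample)
for work_type in sorted(summary):
    print(work_type, summary[work_type])
```

If one type of work shows far more "over" than "under" results, that hints at a consistent bias worth discussing with the team, rather than the ordinary right skew you would expect from any set of estimates.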

In the case of this team, after the estimation discussion was completed (there were other things that needed to be discussed in addition to the issue of accuracy), the conversation turned to the original driver for the conversation – improvements in planning – which ended up with a discussion on making and keeping commitments.

Note: This blog post was turned into a paper and presentation and presented at the Agile 2014 Conference:

Note: This posting was originally published in August 2013 on a WordPress blog.
