In agile methodologies (e.g. Scrum), the complexity/effort needed for user stories is measured in story points. Story points are used to calculate how many user stories a team can take on in an iteration.
What is the advantage of introducing an abstract concept (story points) when we can just use a concrete measurement, like estimated man-days? We can also calculate velocity, estimate coverage of an iteration, etc. using estimated man-days.
In contrast, story points are harder to use (because the concept is abstract), and also harder to explain to stakeholders. What advantage do they offer?
I think one of the main advantages is that humans and developers specifically are actually pretty bad at estimating time. Think of the nature of development too — it’s not some linear progression from start to finish. It’s often “write 90% of the code in 10 minutes and then tear your hair out debugging for 17 hours.” That’s pretty hard to estimate in the clock timing sense.
But using an abstraction takes the focus off of the actual time in hours or days and instead puts the focus on describing the relative expense and complexity of a task as compared to other tasks. Humans/developers are better at that. And then, once you get humming with those point estimates and some actual progress, you can start to look at time more empirically.
I suspect that there is also an observer effect that happens with time estimates that wouldn’t happen with point estimates. For instance, the incentive to sandbag an estimate and deliver way “ahead of schedule” is going to be muted with indirection in a point based system.
If you’re using Fibonacci numbers (or something similar), it limits the number of options when estimating a story. I worked with a group that used low numbers only: 1, 2, 3, 5, 8, and 13. We had a reference story that was a 5. This enabled us to easily make snap decisions on a story’s complexity while doing Planning Poker. The other side effect was that anything rated a 13 probably had insufficient information and needed to be broken down further. I seriously doubt it would have been that easy and straightforward if we were using raw hours.
Your Product Owner speaks your stakeholders’ language and should be able to translate between story points and man-hours (or other units) as needed. During my time as a PO, I had some hard data that 1 story point = 4 man-hours, but obviously every team is different.
Edit:
With the knowledge that 1 point = 4 hours, you could theoretically change your Planning Poker deck to 4, 8, 12, 20, 32, and 52. But those numbers feel harder to deal with. I think I would mentally abstract the values back to something simple, e.g., “less than a day”, “more than a week”, etc. And if I’m going to do that, I might as well stick with the abstract unit-less story points.
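To make the arithmetic concrete, here is a minimal sketch of that conversion. The 4-hours-per-point figure is the one measured for my team above, not a general constant, and the function name is just illustrative.

```python
# Team-specific conversion factor from the answer above; other teams will differ.
HOURS_PER_POINT = 4

# The low-number Planning Poker deck described earlier.
deck = [1, 2, 3, 5, 8, 13]

def points_to_hours(points, hours_per_point=HOURS_PER_POINT):
    """Translate a story point value into man-hours for stakeholders."""
    return points * hours_per_point

hours_deck = [points_to_hours(p) for p in deck]
print(hours_deck)  # [4, 8, 12, 20, 32, 52]
```

Keeping the factor in one place means the team keeps voting in points and only the translation for stakeholders changes if the measured ratio drifts.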
It’s to enable estimation to get better over time, without the estimators all having to adjust their estimation.
Rather than everyone involved in the estimate having to think like “OK.. looks like 2 man days.. but last sprint we underestimated everything, so maybe it’s really 2.5 man days. Or 3?”, they carry on the same as always. “5 story points!”
Then, you adjust your estimation of how many story points the team can get through in a sprint, based on actual measured achievement in previous sprints. “We’ve been doing 90-110 story points per sprint previously!”
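As a rough sketch of how that measured range gets used: the estimates stay untouched and only the observed velocity feeds the forecast. The 500-point backlog below is a made-up number for illustration, and `sprints_needed` is a hypothetical helper name.

```python
import math

def sprints_needed(backlog_points, velocity_low, velocity_high):
    """Return (best_case, worst_case) sprint counts for a backlog,
    given the range of velocities actually observed in past sprints."""
    best_case = math.ceil(backlog_points / velocity_high)
    worst_case = math.ceil(backlog_points / velocity_low)
    return best_case, worst_case

# Hypothetical 500-point backlog with the 90-110 points per sprint seen so far.
print(sprints_needed(500, 90, 110))  # (5, 6)
```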
I would say the theory behind this is that developers are better at estimating relative complexity of different dev tasks than they are at estimating absolutes. Especially if multiple people are estimating a task which could be done by any one of them (and not everyone works at the same speed as everyone else).
Cynical alternative: I’ve seen it said that developers never come in under time estimates. If something takes longer than was estimated, you’ve gone over. But if something takes less time than estimated, developers may fiddle with it, gold-plate, or just slow down and take it easy since they’ve been given a cushy assignment. Taking the real units of time out of an estimate may curb these tendencies. End cynical alternative.
Man-days or man-hours are, as you say, concrete. So when a task is estimated at 5 hours and takes 6, it is now a late task.
When you have a story that is 3 points and it takes 6 hours, it took 6 hours; it’s not late, it just took six hours. The velocity measurement then is more a factor of how many of those points you get done in a sprint, and that number can fluctuate, because it isn’t concrete. You also are not measuring each task, but the total of all the tasks. When you have hours on each task, the temptation is there to measure each task. When that happens, the sprint gets no credit for a task that finishes under its estimate, but any task that runs over its estimate is penalized.
It can be a transition to thinking in terms of points. One place I worked before we even introduced agile used t-shirt sizes just to get an idea on the level of effort. Points are just an extension of that.
That isn’t to say there isn’t controversy, or some arbitrariness in assigning the points. We have members of our team who almost always vote the lowest number and who, when they think a task is a 1 and the rest of us think it is a 3, complain that we are suffering from point inflation.
The abstraction is sort of the point. Using the ‘man day’ as a measurement has a number of pitfalls, including:
- If the team isn’t familiar with the tech they are going to be using, then it can be really hard to give absolute time estimates of how long a task might take. They are much more likely to be able to give good relative estimates, e.g. “task A will probably take twice as long as task B”.
- Different people work at different rates! If you use ‘man days’ you pretty much have to change the time estimate when a task is passed from one developer to another. Who defines how much work constitutes a ‘man day’ anyway?
If you want to estimate man-days it’s a simple calculation:
story points in story / average story points per developer per day = estimated man-days
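Here is a small sketch of the formula above, including one way the per-developer rate could be derived from a finished sprint. The sprint figures (60 points, 3 developers, 10 working days) are invented for illustration, and both function names are hypothetical.

```python
def points_per_dev_per_day(sprint_points, developers, working_days):
    """Average story points one developer completes per working day."""
    return sprint_points / (developers * working_days)

def estimated_man_days(story_points, rate):
    """The formula above: story points / points per developer per day."""
    return story_points / rate

# Hypothetical history: 60 points finished by 3 developers in a 10-day sprint.
rate = points_per_dev_per_day(60, 3, 10)  # 2.0 points per developer per day
print(estimated_man_days(8, rate))        # 4.0
```

Note that the rate is a team average, so the result is only a translation for people who need days, not a promise about any one developer.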
As already mentioned, story points are a relative measure of complexity. One can use a power-of-2 series (1, 2, 4, 8, 16…) or a modified Fibonacci scale (1, 2, 3, 5, 8, 13, 20…) for estimation. As others have argued, developers are quite adept at saying something like this:
Feature A is almost twice as hard as Feature B
But it’s really difficult to say ‘how long’ will this feature take for implementation. You let that be balanced by velocity. So if something was estimated as a 5 but turned out to be a 13, a slower velocity would normalize that for the iteration (or you could re-estimate).
Now, there is another alternative called ‘ideal days’ (somewhat similar to man-days, but I’m not sure if that’s what you meant), and I know of quite a few teams who prefer it. Ideal days are to be interpreted as:
How long the story would take if it were all I did after coming to the office, taking only the necessary breaks, with no interruptions and with everything I need to ‘implement the story’ at hand, i.e. no peripheral activities like meetings, responding to mail, etc.
Mike Cohn, one of the many well-known agile evangelists, provides the following comparison between story points and ideal days:
Story Points
- Helps drive cross-functional behavior i.e. teams estimate stories w.r.t. total implementation complexity all the way from UI to DB and back.
- SP estimates don’t decay i.e. a few months from now a 5 point story is still likely to be 5 points, but an ideal day estimate may change depending on the acquired development skill/speed of that particular programmer
- SP are a pure measure of size, i.e. they reflect only size w.r.t. complexity. Period. No duration etc. thrown in. That’s the job of velocity. Not so with ideal days: there is a tendency to muddle them with calendar days. Keeping the unit abstract, as SPs do, fights the temptation to compare with reality. Just a measure of size. No nonsense.
- Estimating in SPs is typically faster than estimating in ideal days. It may be tricky for the first couple of stories, but once you get the hang of it, it’s faster.
- Different developers can have a different take on their ideal day estimate for completing a story. I could do the same in 3 and you could in 5. SPs are more or less uniform across the board. They level the playing field so to speak.
Ideal Days
- Easier to explain outside the team; for obvious reasons 🙂
- Easier to estimate at first as mentioned above. But once you get the hang of SPs it comes naturally
Now, which one to choose is up to the team. However, in line with most answers here and with my personal experience, I prefer story points. Ideal days don’t really have that much of a benefit over SPs (and Mike Cohn also advocates SPs, along with many other agile evangelists).
First, people are better at relative estimates than absolute estimates. The Babylonians mapping and rating the relative brightness of stars is a great example. They didn’t get the absolute figures right, but the order was mostly spot on even for very similar intensities.
The second advantage is that a prime reason for doing this exercise is to drive conversation. If you start discussing in exact days, conversation may quickly derail.
As the saying usually attributed to Eisenhower goes: plans are worthless, but planning is everything.
Third, the project manager does not have to edit all the estimates just because they turn out to be off by a factor of, e.g., 1.3.
Story points also help you to measure performance improvement of the team over time. In addition, you do not need to re-estimate everything as performance improves.
Take this example which uses man days:
The team estimates different tasks with man-days. It works for a while, but after some time you see that the team is done faster with many tasks than originally thought. So the team re-estimates the tasks. It works for a while, and after some time you see again the same thing: The team is done faster with many tasks again. So you re-estimate again, and this story repeats again, and again and again…
Why? Because the performance of your team increased. But you do not know about it.
The same example with story points:
The team estimates the size of the user stories. After some sprints, you see that the team can do about 60 story points per sprint. Later, you see that the team has achieved more than 60 story points, maybe 70. And the team continues like that and pulls more user stories for next sprints and delivers them.
Why? Because performance has increased. And you can measure it. And you do not need to re-estimate everything after the performance of your team has increased.
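A quick sketch of what "you can measure it" means here, using the 60 and 70 points per sprint from the example. The 420-point remaining backlog is a hypothetical number, chosen only to show that the story estimates themselves never change.

```python
def velocity_change_percent(old_velocity, new_velocity):
    """Relative change in velocity between two periods, in percent."""
    return (new_velocity - old_velocity) / old_velocity * 100

change = velocity_change_percent(60, 70)
print(f"{change:.1f}%")  # 16.7%

# Hypothetical remaining backlog: the estimates stay as they were,
# only the forecast derived from them moves.
remaining_points = 420
print(remaining_points / 60)  # 7.0 sprints at the old velocity
print(remaining_points / 70)  # 6.0 sprints at the new velocity
```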
I’m surprised no one has mentioned Parkinson’s Law yet.
Work expands so as to fill the time available for its completion.
Basically, if you’re estimating in any kind of time unit for large tasks, developers will tend to take the time they estimated to complete them, or go over. When you estimate in a nebulous unit like story points or T-shirt sizes, you avoid this pitfall.
Story Points reflect the complexity of the problem, and therefore, reflect the confidence (or risk) of how accurate the estimate is.
A Story with a high story point tells me that there’s a lot going on with the user story that isn’t concrete.
The idea is to see what’s a good balance of varying story points. If I’m being shown an iteration plan in which every story has a high point value, this gives me little confidence that the iteration will be executed as expected, and it tells me that we need to look at other stories for the iteration or start breaking stories down.
When communicating with a manager or Product Owner, high story points mean that it will be extremely hazy as to when they will get a particular feature. One of the solutions to this is to break the story down and hopefully you’ll have a combination of low and high story points to work with so that you can iteratively demonstrate progress to the Product Owner.
Man days estimate the time it takes to do something. They are best used when the items you are estimating are very precise and measurable. Specific, well known, repeatable tasks are estimable in man days.
For example, if a sales person can make 20 customer calls per day, on average, we can calculate how much time each call takes and from that we can estimate how many man days it will take to make 1000 calls.
In this example one can concretely think in statistical terms about the median length of a call because all calls can be assumed to be effectively the same thing.
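Spelled out as arithmetic, the sales example looks like this. The 20 calls per day and the 1000 calls come from the example; the 8-hour working day is my assumption, needed only to get a per-call time.

```python
CALLS_PER_DAY = 20        # from the example above
TOTAL_CALLS = 1000        # from the example above
WORKDAY_MINUTES = 8 * 60  # assumption: an 8-hour working day

minutes_per_call = WORKDAY_MINUTES / CALLS_PER_DAY
man_days = TOTAL_CALLS / CALLS_PER_DAY

print(minutes_per_call)  # 24.0
print(man_days)          # 50.0
```

This only works because every call is treated as interchangeable, which is exactly the property user stories lack.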
Story points determine which combination of stories can be done in an iteration. They are used to combine heterogeneous goals with fuzzy boundaries and to measure how many can be done in a fixed time frame. They estimate the complexity of chunks of work relative to each other so that they can be added together.
For example, your team developed 5 stories for a total of 23 points in iteration 1, and 8 stories for 20 points in iteration 2. From this you can estimate that in iteration 3 your team will complete a few stories whose total is around 20 points.
Note that we do not need to determine the size of one point and in particular there is no assumption whatsoever that each story of the same size will take the same time to be developed! We only work on sums and on points per iteration. I didn’t even mention how long the iteration is.
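As a sketch of "working on sums", here is one possible way to pick stories for iteration 3 from the 23 and 20 points above. Averaging the past iterations is just one choice of capacity, and the backlog, story names, and `plan_iteration` helper are all hypothetical.

```python
from statistics import mean

past_velocities = [23, 20]        # points completed in iterations 1 and 2
capacity = mean(past_velocities)  # 21.5, one simple choice of capacity

def plan_iteration(backlog, capacity):
    """Walk the backlog in priority order and take each story
    whose points still fit into the remaining capacity."""
    planned, total = [], 0
    for name, points in backlog:
        if total + points <= capacity:
            planned.append(name)
            total += points
    return planned, total

# Hypothetical prioritized backlog of (story, points) pairs.
backlog = [("login", 8), ("search", 5), ("export", 13), ("profile", 5), ("help", 3)]
print(plan_iteration(backlog, capacity))
# (['login', 'search', 'profile', 'help'], 21)
```

Nothing in the sketch knows how long a point or an iteration is; it only compares sums of points against a sum observed earlier.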
If you walk up to a human on the street and ask “How big was a T-rex?”, the answers would fluctuate. The majority of people know what a T-rex is and roughly how big it was, but nobody really knows for certain, because we have NO relative scale to baseline from.
That’s the cognitive behavior you’re trying to figure out with forecasting, and many methodologies spin cycles selling “I’ve got it! I have the secret to accurate forecasting!” snake oil to the masses. When you actually do forecast, you’re saying out loud “I will ALLOW x days/hours/points for that to complete”; it’s in a sense creating a “timebox” for that event to be carried out within.
For me, points just shift the boundaries. If you’re in a team that is happy to say “*Well, we have 3 weeks per sprint, and, thumb suck… I figure we should shoot for 30 points to complete in that cycle! Who’s with me?*” and that’s as deep as you go in forecast modeling, fine! Realistically you’re just setting an arbitrary budget and that’s it. You’re also then, in the retrospective, looking at the work completed with a sense of “holy crap, we did 33 pts that sprint, that was pretty cool”, and not much can be done about that. You can use velocity mid-sprint to determine whether you’re getting bang for your budget buck by asking out loud “Have we hit 15 pts yet? Will we?”, but then you’re back to adding relative time to the equation, aren’t you? The danger here is that you’re now using velocity to measure productivity, not capacity, which from what I understand kicks Reactive Release Management (story points) in the head.
The point system is almost too clever to notice that you still attach relative time to the equation: everything from your agreed “sprint cycles” to your daily standups, in which you enact some hidden rule around duration + complexity = an innate, gut-feeling, team code-red moment of “Max is taking too long with that task”.
The human brain can’t forecast, because forecasting involves a lot of working memory mixed with long- and short-term recall; it’s like asking a novice math student to do fractions in their head rather than on paper. It’s why other industries never agree on a forecast and constantly validate forecasts in relative time (e.g. geologists never stop forecast modelling until that cubic meter has been dug out of the ground, and then it’s “done”).
I’d say the point system works if you’re not forecasting. You’re agreeing to a chunk of work that’s based on a sub-chunking algorithm, but that’s really as close to forecasting as you can get. In fact, your release management would look for natural breaks in the “backlog” queue that fit around theme(s). (E.g. in Silverlight, we product managers would wait until the engineering team had completed their backlog and then piece together the themes we initially set. We never knew what the engineering team were doing specifically; we just had a basic outline. We’d then take that body of work and build our marketing event around it (Microsoft Mix).)
When you start locking down velocity expectations inside sprint cycles that rely on velocity + time, you’re back to forecasting estimations again only this time you’re worse off because you’re playing the “it depends game” … More importantly you’re also killing potential for team growth / career growth as well.
The tax you pay for points vs. time is that with points you need to look for alternative measurement formulas to track on-the-job skill development, mentoring, or developer behavior.
That is, you will still need to look at a “median developer” as your ideal person to attach skill/effort to; you can then baseline other developers against that person to determine how they are faring in their ongoing growth within your team. It also highlights situations where the “fast” developers are carrying most of the water but are getting bored or, worse, are working longer hours with no recognition or reward because of competing deadlines etc. Standups don’t unearth this in reality; they are really there to detect bad smells within the team per se, as in “that person is struggling, let’s help”.
Next also comes the “carry over” stories, stories that don’t get chunked into that sprint cycle but then spill over to the next sprint cycle. Which then can easily create a knock-on effect if you’re factoring in time, but the moment you do factor in relative time..again, you just regressed back to “time based forecasting/estimation” and again the point system is just muddying the waters.
If you go with points, you have to ignore time completely, and I mean completely, as the moment you let time creep in you’re gaming the idea / methodology.
Having travelled around the world as an Evangelist, I saw a lot of teams swear their hand on whatever they hold dear that they have cracked the Agile Forecast code… but I always clicked my tongue, smiled and walked away with the thought “yeah…you almost did, but that mistress we call ‘time’… she’s just cruel…“
Mike Cohn’s book “Agile Estimating and Planning” describes the advantages and disadvantages of estimating with “ideal days” or story points, so the quick answer to your question is that you don’t have to estimate with story points. If it’s more natural to estimate in ideal days, go right ahead.
I think the story point method has at least two important advantages over the man-day method:
First, it’s easier to estimate in SP. SP is relative, and humans like us are better at relative estimation than at absolute estimation like the man-day method.
Second, when you estimate in SP, you get “team SP”, not “individual man-days”. When you ask “How long will this task take?”, a senior dev might say 1 day while a junior says 5 days. In other words, a man-day estimate depends on who takes the task to done. If the owner is forced to change (and it will be!), you must reschedule everything. With SP, the estimate stays the same whoever takes the task.
Story point estimation follows the Fibonacci series 1, 2, 3, 5, 8, 13, 21…
A human brain can easily map things based on sizes. For example: if we assign one Post-it card 2 story points, then something three Post-it cards in size would be 2 × 3 = 6 story points.
6 falls between the Fibonacci numbers 5 and 8, with 5 being the closer number, and hence the story point value would be 5.
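The snapping step can be sketched like this. The scale is the series quoted above; the helper name and the tie-breaking behaviour (the lower value wins when two scale values are equally close) are my assumptions, since the answer doesn’t say what to do on a tie.

```python
SCALE = [1, 2, 3, 5, 8, 13, 21]

def snap_to_scale(raw_points, scale=SCALE):
    """Return the scale value closest to raw_points.
    On a tie, min() keeps the first (lower) candidate; that is an
    assumption of this sketch, not a rule from the answer."""
    return min(scale, key=lambda value: abs(value - raw_points))

print(snap_to_scale(2 * 3))  # 5
```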