Waste Deep in Being Done – Or…Why it’s Shorter to Take Longer – Guest Post by Benjamin Spector

Estimated Reading Time: 5 minutes

Introduction

I met Benjamin Spector in one of the recent Agile Boston meetings. He told me a story that I liked a lot since it brought to life one of the key concepts I presented – The Transaction Cost vs Cost Of Delay curve (from Principles of Product Development Flow by Reinertsen). I was able to persuade Benjamin to write up this story…

Waste Deep in Being Done – Or…Why it’s Shorter to Take Longer 

“We should have finished a month ago.”  That was the item with the most votes during the team’s last sprint retrospective.  This last sprint completed the final production-ready feature the team had been working on.  It was delivered just in time for the scheduled annual major product release.  Everyone was decompressing from having faced the possibility of delivery failure.  But even as we celebrated our success, there was a sense of disappointment that we had taken as long as we did.

I had been the team’s scrum master for about 6 months, starting up with them as this latest project began.  The team was small with 4 developers (3 full-time software engineers and 1 QA specialist) plus a product owner and a scrum master…me.  When I started working with them, the team was practicing scrum at a beginner-level.  My initial focus was mainly on getting the team to perform basic agile practices more effectively for estimating and planning, as well as for daily standups, sprint reviews and retrospectives.

Over the course of the project the team was dealing with regression failures that came back to them 1-2 weeks or longer after their original code submissions.  The problem was not terribly taxing on the team in the early stage of the project.  We’d get 1 or 2 regression failure bug tickets and just include them in the next sprint backlog to fix them.  Sometimes we didn’t get to fixing a regression failure until 2 sprints down the road.  It didn’t seem like any harm to kick it to a future sprint.  It was tolerable…or so it seemed.

The team’s practice was to submit production-ready code after running a suite of “smoke test” regression tests.  The product itself was a complex CAD tool with over a million lines of code and up to 25,000 separate automated regression tests.  Running the full suite of tests was an overnight process.  Running a smaller subset of selected regression tests, focused mainly on the part of the code base the team worked in, was therefore a common practice among all our scrum teams.  It allowed for quicker turnaround.  In general, it was felt that running the regression “smoke test” suite enabled everyone to deliver more quickly at a relatively low risk to product quality.  If a couple of regression failures slipped through the net, no one thought it was a big deal.  In fact, this practice was explicitly called out as part of my team’s definition of done for user stories.

But the frequency of regression failures began to increase.  As we got closer to the release deadline, there were more regression bugs to fix, and the time spent fixing them consumed a greater portion of the team’s capacity.  As the scrum master, I did not let this issue go unnoticed.  I wrestled with the question of when would be the best time to raise it with the team.  We were within striking distance of our goal and the team was focused on finishing the project and complying with all the acceptance criteria for the project.  Significantly, one of those criteria was delivery with zero regression failures.

About a week before we finished the project, I began reading Jeff Sutherland’s latest book “Scrum: The Art of Doing Twice the Work in Half the Time.”  I came to the chapter called Waste is a Crime, and a section called Do It Right the First Time, and the words leapt off the pages.  Sutherland gives a powerful example with Palm Inc.’s research on the amount of time taken to fix bugs not found and fixed right away (page 99).  The research showed that it took 24 times longer to fix a bug a week or more after code submission than if it was found and fixed while the developer was actively working on the code, regardless of the size or complexity of the defect.  24 times!

So there we were at the end of the project, with everyone experiencing the elation and relief of a successful delivery mixed with a sense of disappointment that we did not finish as quickly or as cleanly as we had expected.  “We should have finished a month ago.” Why didn’t we?

It was at this moment that I jumped up and said, “Hang on just a second.  I’ve got to get something at my desk that will be really interesting for everyone to hear.”  I bolted out of the conference room, ran to my desk, grabbed the Sutherland book, and returned slightly breathless with the page opened at the passage describing the Palm Inc. research about bug fixing taking 24 times longer.  The gist of the Palm Inc. story was about one and a half pages long, so I asked the team’s permission to read it aloud before we continued with our retrospective discussion.  Everyone agreed with some amusement and curiosity about what I was up to.  When I finished reading the passage, I could see the impact in the eyes of every team member.  The team members began looking at one another, recognizing this shared insight.  That’s the moment when I knew I had their attention.

I put the question to the team, “How many regression bugs have we fixed since we started this project?”  The answer was 35.  I had already sized the problem when I began monitoring it closely over the last 3 sprints.  I quickly showed them my query in the Jira database, displaying the search criteria and the tally of regression bugs on the conference room overhead projector.  Everyone agreed that it was valid.

Then I asked, “On average, how long does it take us to fix a regression bug?”  We started opening up individual records so we could see our tasked-out estimates for each one.  Examples ranged from 8 hours to 16 hours, typically including tasks for analysis, coding, code review, adjustment of some of the tests themselves to accommodate the new functionality, submission for regression testing and final validation.  Some took a little more time.  Some took a little less.  After a few minutes of review, the team settled on the figure of 12 hours of work per regression bug.  So, I did the simple arithmetic on the white board: 35 x 12 = 420 hours.  Then I applied the “24 times” analysis: 420 / 24 = 17.5.  I said, “If the rule holds true, then if we had fixed the regression bugs at the time they were created, in theory it would have taken only 17.5 hours to fix them, not 420 hours.”  Then I doubled the number just to make everyone a little less skeptical.  35 hours seemed more reasonable to everyone.  Nevertheless, it was still a jaw-dropping figure when compared with 420 hours.  While I stood pen in hand at the white board, everyone on the team sat in stunned silence.  While they were absorbing the impact of this new insight, I took to the whiteboard again and wrote down 420 – 35 = 385 hours.  Then I reminded them of our sprint planning assumptions. “Based on our team’s capacity planning assumptions, we plan for 5 hours per day per person for work time dedicated to our project.  For the 4 of you that equals 100 hours per week of work capacity.”  I completed the simple arithmetic on the white board showing 385 / 100 = 3.85 weeks, underlining and circling the 3.85 weeks.  Then I pointed back to the retrospective item with the most votes and said, “There’s your lost month.”
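For readers who want to rerun the whiteboard arithmetic with their own team's numbers, here is a minimal Python sketch. All figures come from the story above; the variable names are made up for illustration, the 5-day work week is an assumption implied by the 100-hour weekly capacity, and the 24x multiplier is the figure quoted from Sutherland's book rather than anything measured on this team.

```python
# Whiteboard arithmetic from the retrospective (names are illustrative only).
regression_bugs = 35
hours_per_late_fix = 12      # team's agreed average per regression bug
late_fix_multiplier = 24     # "24 times" figure quoted from Sutherland's book
skepticism_factor = 2        # the ideal figure was doubled to stay conservative

developers = 4
hours_per_day = 5            # planned project hours per person per day
days_per_week = 5            # assumption: a standard 5-day work week

actual_hours = regression_bugs * hours_per_late_fix            # 420
ideal_hours = actual_hours / late_fix_multiplier               # 17.5
conservative_hours = ideal_hours * skepticism_factor           # 35
wasted_hours = actual_hours - conservative_hours               # 385
weekly_capacity = developers * hours_per_day * days_per_week   # 100
lost_weeks = wasted_hours / weekly_capacity                    # 3.85

print(f"Hours spent fixing late:      {actual_hours}")
print(f"Hours if fixed right away:    {conservative_hours:.1f} (doubled estimate)")
print(f"Hours lost to delayed fixing: {wasted_hours:.1f}")
print(f"Weeks of team capacity lost:  {lost_weeks:.2f}")
```

Swapping in a different multiplier or skepticism factor shows how sensitive the "lost month" is to the 24x assumption.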

When our retrospective ended we left the meeting with a significant adjustment to our team’s definition of done.  We replaced the “smoke test” regression testing requirement with the practice of always running the full regression test suite on the code submitted for a story and resolving all regression failures before considering the story done.  This change was made with the enthusiastic and universal agreement of every team member.  Everyone recognized that it would take longer to finish each story up front.  But, they were happy to accommodate the extra time because now we all knew, without even the slightest doubt, that even though it would take longer to finish the story the first time, it was always going to take a lot less time than having to go back and really finish it later.

About Benjamin

Benjamin Spector

Benjamin Spector has worked as a software product development professional and project manager for over 20 years.  For 4 of his last 9 years at Autodesk, Inc., he has worked as a scrum master for several teams and as a full-time agile coach introducing and supporting agile practices throughout the organization. Reach out to him on LinkedIn.

 

Scrum Sprint Commitment Rant

Estimated Reading Time: 9 minutes

Going on a Rant

If there’s one thing that makes me mad whenever I see it, it is teams abusing the commitment concept in scrum. I’ve been on a rampage against dysfunctional sprint commitments for a while now, but lately my thoughts have crystallized a bit, especially when I had a chance to discuss this with Jim Benson, Alan Shalloway, Chris Hefley and Jon Terry last week at Lean Kanban Benelux 2011.

Background

So what is the problem? Well, quite often you see scrum teams that finish sprints out of breath, out of quality, out of joy. You also see teams that start the sprint full of numbing fear and set a low bar, and that low bar becomes a self-fulfilling prophecy. Add to that Product Owners, Scrum Masters and managers all spending precious time worrying about whether we are able to make accurate sprint commitments, instead of working to improve the actual capability of the team.

It’s quite sad actually. Surely that’s not what scrum should look like, and indeed other teams have energized, focused sprints where they deliver what they can, stretch their abilities just the right amount and finish a sprint with just the right energy and mindset to joyfully go into the next one.

So what’s causing this?

Well, let’s start with the out-of-breath teams. It typically starts with unrealistic commitments they make in the sprint planning. They make those commitments because they’re pushed to, either explicitly or implicitly. Yes, scrum says the team should pull according to their capability. But something about the way this all works de-emphasizes the actual capability of the team and motivates them to try to take on more than they can handle.
With this in play, they start, and since there is a lot in their sprint backlog they have the green light to start many things in parallel. A few days later, in the last mile of the sprint, there are still many items in progress, and the result is either an unsustainable effort to reach the finish line, cutting corners, or a very disappointing sprint result. In our #LKBE11 discussion we referred to those as mini-death-marches…

With teams living in fear it is a different but related story. It starts with the message/spirit conveyed to them by their Product Owner, managers or previous life management culture. When they hear commitment they hear “miss that and you’re in trouble”. And if the ecosystem is such that meeting the sprint commitment is more important than the overarching purpose of the project/release/feature they will be driven to satisfy what they perceive as important – being predictable at the sprint level. So they make a safe commitment. Usually this is achieved by taking safety in the estimates. And so starts a self-fulfilling prophecy, as described by Parkinson’s law and Donald Reinertsen’s principle of the expanding work.

It doesn’t help that the team thinks that if they are able to deliver more, there is no turning back – from that point on they will be asked to deliver more on a consistent basis.

Let’s pause here for a second – isn’t it a reasonable expectation? Shouldn’t the team commit and deliver more in the future if they’re able to? The problem is that even during a short 1-4 week sprint, there’s still a lot of unavoidable uncertainty and variability: in exactly what we need to accomplish (requirement space), in how to do it (problem space) and also in how much time we will have for it (capacity). A lot of teams try to eliminate this variability and spend a lot of effort on it. Planning meetings grow longer, people’s capacity is planned at the micro-level…

Many teams will oscillate between over-commitment and under-commitment precisely because of this variability. They and their management will be frustrated if their measure of effectiveness is meeting the commitment. The only ways to consistently meet a commitment are working at an unsustainable pace or making a really safe commitment.

Let’s eliminate commitment

Well, just as an exercise for now, to see why it’s there in the first place…

Without a sprint commitment, what will the sprint look like? Probably we will see people taking on work from all over the place. They will start at the top priority, but their nature will lead them to start many other backlog items since there is no focusing force urging them to stop starting and start finishing. So we need commitment, or something else, to encourage a team to focus on a few things and finish them first. An alternative to commitment at the stories level is to say we are focusing on a single feature so let’s finish it before moving on to anything else.

Commitment as a Focusing mechanism

Wait – this is the Scrum Sprint Goal – teams are supposed to agree on a Sprint Goal they will focus on. The detailed story-level commitment is an elaboration on that anyhow. If our product backlog is very fragmented and not feature-oriented, we will have a tough time using an effective sprint goal though. This is something to wonder about in and of itself… but if it’s indeed the business reality that we are doing many small things, we need another focusing guidance. That guidance can be “we think we can finish at least 8 stories, hopefully 4 more, so let’s start with 8, get a good feeling we can finish them, and ONLY THEN move on to the 4 others”. Here, the team is still using the sprint commitment, but they’re using it for themselves as a focusing / work in process limiting mechanism.

Containers

Another problem we might have without commitment is that the work will expand uncontrollably. There is no finish line so there is no container. One thing that might help is a very energizing purpose: where we need to get to by the end of the Feature/Project/Release and why it needs to happen by a certain point in time. Seeing our progress towards that goal (or lack of progress…) will help energize our efforts and reduce the expansion of work.

Commit to Capabilities Improvement

Another thing that might help is to start looking at our capability as a team and make a commitment not to exactly what we deliver but in general to improve our capabilities. The capability we care about is velocity as well as ability to turn out the top priority items in the backlog as soon as possible since they are the highest priority. So let’s monitor our capabilities over time and try to make them more predictable first and improve them as a next step. Specifically, measuring Velocity can be done without making any sprint commitment. Just track the velocity for each sprint, preferably on a control chart so you can start to understand the variability in your capabilities.
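As a rough illustration of tracking velocity on a control chart, here is a minimal Python sketch. The post does not say which kind of control chart to use, so the individuals (XmR) chart below, with its conventional 2.66 scaling of the average moving range, is an assumption on my part; the function name and the velocity numbers are hypothetical.

```python
from statistics import mean

def velocity_control_limits(velocities):
    """Return (center, lower, upper) limits for an individuals (XmR) chart.

    Limits are center +/- 2.66 * average moving range, the usual XmR
    convention. The lower limit is clamped at zero since velocity
    cannot be negative.
    """
    if len(velocities) < 2:
        raise ValueError("need at least two sprints to compute a moving range")
    center = mean(velocities)
    moving_ranges = [abs(b - a) for a, b in zip(velocities, velocities[1:])]
    spread = 2.66 * mean(moving_ranges)
    return center, max(0.0, center - spread), center + spread

# Hypothetical velocities for the last eight sprints, no commitment required.
velocities = [21, 25, 18, 30, 24, 22, 27, 19]
center, lower, upper = velocity_control_limits(velocities)
outliers = [v for v in velocities if v < lower or v > upper]

print(f"center={center:.2f} lower={lower:.2f} upper={upper:.2f}")
print(f"sprints outside the limits: {outliers}")
```

A sprint that lands outside the limits is a signal worth discussing in the retrospective; points inside the limits are ordinary variation and not a reason to re-plan.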

How can we make promises without commitment?

This is a point I love. On one hand, Agile diehards say there is no commitment in agile – “we will just work sprint to sprint and avoid any clear external commitment the business can count on”. On the other hand, if you start a discussion about losing the sprint commitment, they and others start asking “how can it even work without the team making a clear commitment and sticking to it?”. Bottom line, the sprint commitment doesn’t help you one bit in making external commitments and meeting them. It’s simply orthogonal to it. You make external commitments based on size estimations and historical/estimated capabilities. You meet external commitments by monitoring where you are towards them and adjusting scope, resources and pace sprint by sprint. If you use the sprint commitment as you should, it gives you nothing towards that goal. Accuracy in sprint commitments is micro-predictability. The business cares about mezzo/macro predictability. A long-term stock investor, similarly, doesn’t care about the fluctuations within a day or a week; they care about the stock’s performance over a quarter or a year. The team should care about reducing variability in its capabilities, e.g. having a lower variability in Velocity, so that more aggressive mezzo/macro commitments can be taken on while still allowing safe and sustainable delivery.
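To make the "size estimations plus historical capabilities" idea concrete, here is a small Python sketch of a release forecast that never touches a sprint commitment. It is only one simple way to do it: the function names, the backlog size and the velocity history are all hypothetical, and using the best, average and worst historical velocity as the range is my assumption rather than something the post prescribes.

```python
import math

def sprints_needed(backlog_points, velocity):
    """Whole sprints needed to burn down the backlog at a given velocity."""
    return math.ceil(backlog_points / velocity)

def release_forecast(backlog_points, velocities):
    """Return (optimistic, typical, pessimistic) sprint counts.

    Uses the best, average and worst historical velocity respectively.
    """
    average = sum(velocities) / len(velocities)
    return (
        sprints_needed(backlog_points, max(velocities)),
        sprints_needed(backlog_points, average),
        sprints_needed(backlog_points, min(velocities)),
    )

# Hypothetical inputs: a 200-point release and eight sprints of history.
velocities = [21, 25, 18, 30, 24, 22, 27, 19]
optimistic, typical, pessimistic = release_forecast(200, velocities)
print(f"{optimistic} to {pessimistic} sprints, most likely around {typical}")
```

The narrower the spread in historical velocity, the narrower this range gets, which is why reducing variability allows more aggressive external commitments.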

How can other teams count on us if we don’t have a clear commitment for the sprint content?

What if we are in an environment where other teams in the group/portfolio count on deliveries from us on a sprint by sprint basis? If we don’t have any commitment how will they know when to expect the delivery from us? If they intend to work in parallel to us, how will they know whether to plan for this or not?

There are a couple of ways to look at this. If 80% of the work is consumed by other teams then we should probably reconsider the organizational design. Maybe it would be better to work as a single team. Maybe it is a case of us providing a service that is consumed by many other teams, and then it might be better to move towards a pull system, where there is less reliance on dates and instead an agreement on priority, an understanding of our capability in the form of typical lead time from requesting a service from us to the time we deliver it, and consumers using that service whenever it is ready, either at their next sprint or, even better, as soon as it’s ready. If you’re thinking this will make planning sprints more complicated and prone to changes, you are right. The solution can be to move to full pull mode at the team level, or to reduce the batch size you plan for, meaning shorten the sprint length.

If it’s just sporadic work that others depend on, make sure that is what you start with and make a commitment to deliver it. I wouldn’t be surprised if the term Class of Service comes to mind at this point…

What will be the engine of continuous improvement if we don’t have a target commitment to strive for?

Scrum is about Continuous Improvement, right? What drives this? Isn’t it the need to meet commitments? To be better about commitments?

Well, not exactly. The thing that is driving Continuous Improvement is the fact that there is a container, composed of a certain scope to focus on, a certain time to do it in, and the people/capacity to do it with. Think of circling the team with a rope and telling them to move together towards the target. This will cause a lot of pain. Some people are faster, others are slowing the team down. Some impediments come up and cause problems. But the rope keeping the team together forces them to deal with the problems rather than defer them by making progress on things outside the container just to maintain the comfortable feeling of progress.

So in order to maintain this improvement-inducing container we need the time, the team, and a certain scope to focus on. We can do that with the Sprint Forecast mentioned before.

One important concept in Continuous Improvement is to have a vision / target condition to strive for. What is that target condition in a Scrum environment? As mentioned above, this typically is to improve capabilities.

Improving throughput/velocity requires more scope in each container.

How do we translate improving business agility to the container? The ability to define a shorter time frame that the team can still deliver in. The shorter the time frame the more opportunities to change direction without causing waste. Problem is that there is a limit to this. Work takes time, and there’s a limit to how small we can slice it to still be able to use a container of this structure. That is why, at some level, in order to improve business agility even further, we need to move to another form of container, one which limits the amount of things we are working on as a team at each point in time.

(Clarifying note – If you’re reading this to mean get to a certain level with Scrum then move to Kanban, that’s not what I mean. You indeed will benefit from Kanban at this level, but you can start your journey with Kanban in the first place, or move to it regardless of where you are on the way)

So can we get rid of the Sprint Commitment or not?

Well, my personal opinion is that we can live without a Sprint Commitment as currently practiced by the majority of Scrum Teams out there. It seems the creators of Scrum think along similar lines, as they replaced Sprint Commitment with Sprint Forecast in the latest Scrum Guide.

I personally think commitment is important; it’s only a question of what you commit to. I prefer to focus on the following types of commitments:

  • Commit to learn about your capabilities, care about them and continuously improve them, by using a focusing mechanism challenging the team as a whole.
  • Commit to deliver the class of service that the business and other teams expect, which means delivering on time when it matters, delivering the most throughput when it matters more, etc.

 

Some more ideas to try at home…

Before we conclude this long post – Some related experiments you might want to try at home…

  • If you feel you are overstretching, try setting a very low forecast for a few sprints, meet it, and see what it looks like. Talk about it. Learn from it.
  • Try limiting the amount of Features/Goals in one sprint. Talk about what it changes in the energies and focus of the team. If you cannot set a limit, that’s an interesting discussion in and of itself, and one you should have.
  • Use the Sprint Goal and Sprint Stretch more aggressively. Set a lower goal, and commit to deliver the goal first, and as much of the stretch as possible. The goal should be something you can consistently deliver 95% of the time. (Mike Cohn recommends basing that goal on the mean of the 3 worst sprints out of the last 8; another way is to use 2 standard deviations below the mean if you want to take a more statistics-oriented approach.) Whether 95%, 85% or lower is your call. But the expectation should be that if there is difficulty meeting even this commitment, it’s not forbidden to pick up the pace a bit in order to meet it. Learn from it at the end of the sprint and plan more effectively next time.
  • Read about the XP Planning Game and try it… Seems the idea that iterations can be effective without a commitment is not a new one 🙂
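The two ways of picking a conservative sprint goal mentioned in the third experiment above can be sketched in a few lines of Python. This is an illustration under assumptions: the function names and velocity history are hypothetical, the sample standard deviation is my choice, and neither rule guarantees exactly a 95% hit rate on real, non-normal velocity data.

```python
from statistics import mean, stdev

def goal_from_worst_sprints(velocities, window=8, worst=3):
    """Mean of the `worst` lowest velocities among the last `window` sprints."""
    recent = velocities[-window:]
    return mean(sorted(recent)[:worst])

def goal_from_std_dev(velocities, sigmas=2):
    """Mean velocity minus `sigmas` sample standard deviations, floored at 0."""
    return max(0.0, mean(velocities) - sigmas * stdev(velocities))

# Hypothetical velocities for the last eight sprints.
velocities = [21, 25, 18, 30, 24, 22, 27, 19]
worst_based_goal = goal_from_worst_sprints(velocities)
sigma_based_goal = goal_from_std_dev(velocities)
stretch = mean(velocities)  # assumption: treat the plain average as the stretch

print(f"goal (3 worst of last 8): {worst_based_goal:.1f}")
print(f"goal (mean - 2 sigma):    {sigma_based_goal:.1f}")
print(f"stretch (mean):           {stretch:.1f}")
```

Whichever rule you pick, the gap between the goal and the stretch is a visible measure of how much variability the team still has to work on.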

Conclusion

Scrum has some good things going for it. The Scrum-style Planning Game and Sprint Commitment, as currently understood and practiced by most teams and organizations, are not among them. I hope this post will help at least some of those teams improve their results as well as their happiness.

Want my elevator-pitch answer to what is Kanban for a Scrum rookie?

Estimated Reading Time: 2 minutes

 

Our coaching team at AgileSparks runs into this question a lot.

Many of the teams we are working with are familiar with Scrum and using it. Other teams are just now going into Scrum.
Since Kanban is becoming a hot buzzword, we often get asked – so what is this Kanban thing? How is it related to Scrum?

We needed a good answer, one that depends on the context, the amount of time you have to answer, and the maturity of the person/forum asking.

In this post, I will try to give the answer you give when someone finds you in an elevator, in the last 2 minutes of a workshop, or on the way back from lunch. In short, the two of you have very little time for the answer.
Add to that the fact that their knowledge is quite limited.

Here goes:
"What you might have heard about Kanban is that it's Scrum without sprints.
I would say that Scrum is an agile approach where the container used to protect, focus and challenge the team is the time-boxed sprint. 
Kanban is another Agile approach! In Kanban the container used to protect, focus and challenge is limiting the amount of things we do in parallel – Limiting the Work in Progress. If you need to remember one thing – remember and look up Limit the WIP"

If you are in a very high building, you can also add:
"Mixing the two can lead to beautiful results – called ScrumBan. Also, one of the biggest differences is in what an Agile change usually looks like with Scrum/Kanban. Scrum is a revolutionary, big-change-up-front approach. Kanban is more of an evolutionary, laser-focused approach where you find where to focus (using the WIP limit as the challenging force), do something there, then continue to the next area to focus on. If you've heard of TOC, it's quite similar in how it manages change."

Now all of this is very simplistic, but concepts like Cycle Time, the Lean origins, and other Kanban goodies are probably too much for a rookie with a very short attention span at the moment.
The important thing is to grow an interest for what this WIP limit means and look it up.