27 May 2012

The Agilist's Dilemma

Let me tell you a dirty little secret: An Agile approach to crafting software comes with its own challenges.

I'm guessing you already knew that.  Does the following tale seem familiar?

Fields of Green

Imagine you start developing fresh new "greenfield" software today using the basic Scrum process framework (for example), and little more.  The developers design and build just enough to complete the commitments of the first iteration; the testers test it, and the demo goes smoothly.

This continues for a handful of iterations.  The team starts to "gel," gives itself a name that has little to do with the current project, holds meaningful meetings, thrives in its collaborative team space, and even starts to handle its own personnel issues without escalation. The Product Champion (Scrum "Product Owner", XP "Onsite Customer") is engaged, enthusiastic, and encouraged by the increments of real value demonstrated in each Demo. All is sunny!

Then a subtle shift occurs. Defects are found. New functionality starts to take longer and longer to build. Developers discover that with each new story, they're having to reshape their design to accommodate new functionality without breaking old (i.e., they must "refactor"). But folks outside of engineering can't see this: Product folks may worry about the cost of this "refactoring" stuff. Testers know they have to test not only the new stories, but every old story as well. And they're understaffed (as always), so they start to accrue a separate backlog of stuff to test.
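
To make that "reshape without breaking" idea concrete, here's a minimal sketch.  It's my illustration, not from any real team; the names and rates are invented.  The "just enough" design from an early story has to be refactored before a later story's variation can be added cleanly:

    # The iteration 1 story needed only one flat shipping rate, so the
    # simplest design that could possibly work was a single function.
    def shipping_cost(order):
        return 4.99  # flat rate: "just enough" for the first story

    # A later story adds weight-based international shipping.  Rather
    # than piling conditionals into shipping_cost, the team reshapes
    # the design: old behavior is preserved, and each new variation
    # gets an obvious home.
    class FlatRate:
        def cost(self, order):
            return 4.99  # unchanged for existing domestic orders

    class InternationalByWeight:
        def cost(self, order):
            return 10.00 + 2.50 * order["weight_kg"]  # invented rates

    def shipping_cost_v2(order, policy):
        return policy.cost(order)

    print(shipping_cost_v2({"weight_kg": 3}, InternationalByWeight()))  # 17.5

The point isn't the particular pattern; it's that every such reshaping is real work, and it's invisible to anyone watching only the feature list.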

Eventually, real throughput (valuable new functionality with an acceptable level of quality) in each iteration comes to a dead stop.

The results of the Agilist's Dilemma

That's what I call the Agilist's Dilemma: The very fluidity that we espouse has become an impediment. "Had we only planned and designed for this! If only we knew what all the test cases would be! We should go back to a phased approach!"

Okay, calm down, everyone.  Let's not get all crazy. Remember the good old Death March? The all-night bug-fixing parties? Remember when the classic phased approach destroyed tech companies at a breakneck pace? Ah, yes, the Good Old dot-BOMB Days.

The dot-COM fiasco actually happened after the inception of Scrum and Extreme Programming, but people viewed Agile methods as strange, chaotic, and fringe.  That those methods had conceptual roots in Deming's work from the 1950s didn't help our case at all:  People love to find excuses to dismiss Deming.  Plus there are always high-profile examples of organizations that are very successful without adopting any particular form of process whatsoever. There are always the lucky few.

We know that Agile is about not trying to predict the unpredictable future. It's about seeing what's happening early, so these problems can be repaired while they're small. And it's about winning (creating value) within this iteration (or release), while setting ourselves up to win in the next iteration (or release). It's all supposed to provide the ability to deliver a predictable stream of value while remaining adaptable to the changing tides of business and customer demands.

And that's where our imaginary team is now:  If they (developers, testers, product, and leadership) are sensitive enough to the signs, they'll catch this early, and try something to fix it. That's exactly why Agile methods work:  We detect trouble early enough to do something about it.

The Elusive Silver Bullet

We know there are no silver bullets; no quick fixes. Yet most teams and organizations spend a lot of time covering surface symptoms with band-aid techniques, hoping they can just hold it all together through the next release. After that, "we'll take the time to fix everything that went wrong." Some of these techniques: hiring more testers who have little experience with the product, mandating overtime for developers, and shipping on time while hoping the customers won't notice the quality issues. All of these band-aids are counterproductive, and subtly lay the blame on the people doing the work.

I think this occurs because of two fundamental problems:  Most people are unaware of the subtleties of the system within which they find themselves, and most are driven by metrics that don't relate directly to the health of the whole system. Even at the highest levels of management, people are often partially blinded by their own "performance-based" bonus check. When people are forced to compete within the organization for perceived scarce resources, the whole organization suffers.

I'm not suggesting the fault lies with management, either.  I've heard that the Lean approach is to "blame the process."  I find that blaming is like hoping:  A total waste of time. What all the Lean/Agile/Scrum literature is really suggesting is that we realistically examine and adjust the system.  Scrum gave us perhaps the most pithy and memorable phrase: "Inspect and Adapt."

So we take the time to dig a little deeper to locate those "root causes": Those things that are generating the myriad symptoms.  "Why did that happen?  What's really going on here? What can we do (or stop doing) to help resolve this?"

So we stop blaming (or discouraging) Product folks for wanting the world, or testers for finding and reporting defects, or developers for not being able to type 500 WPM. We model what's currently happening, locate where value is really getting bottlenecked, and then take a single step towards alleviating that bottleneck. Repeat.
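
For a sense of what "locate the bottleneck" means in practice, here's a toy model.  The stages and numbers are purely illustrative, mine and not the author's: each stage of the value stream has a capacity in stories per iteration, and real throughput is capped by the slowest stage.

    # Hypothetical per-iteration capacities, in stories.
    capacities = {"analysis": 12, "development": 8, "testing": 4}

    bottleneck = min(capacities, key=capacities.get)
    print(bottleneck, capacities[bottleneck])   # -> testing 4

    # A "band-aid" applied to a non-bottleneck changes nothing...
    capacities["development"] = 10
    print(min(capacities.values()))             # still 4

    # ...while one step at the bottleneck raises real throughput.
    capacities["testing"] = 6
    print(min(capacities.values()))             # now 6

Then we re-model and repeat, because relieving one bottleneck usually reveals the next.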

And?

I do have some changes to suggest: Techniques that have worked for numerous teams, and that address the collections of root causes that commonly plague software development.

I'm headed somewhere with this post, obviously, but I'm trying something new here (new for me):  Using a blog post like a short chapter or section, rather than having each blog post represent a complete, stand-alone article.  Maybe I'll get more written that way. ;-)

So, please "stay tuned."