Monday, December 10, 2012

The cost of TDD: Numbers

This is a question/discussion that crops up again and again. Looking back at my earlier post, I thought it might be clearer with some numbers. By "TDD" I mean literally "test-driven development" in the sense that unit tests, whether written before or after the behaviour, are a driving consideration for development.

The trouble with pitching unit testing to developers almost inevitably comes down to "it takes too much time."

Let's say you spend one hour developing a nice, clean, atomic feature. How do you know it works? You fire it up and run through a scenario or two. How long does that take? 5 minutes? Maybe you find a problem, so you spend a total of 15 minutes debugging and running through a healthy set of scenarios to ensure your code meets all of the requirements. This assumes you're a competent software developer who's focused on doing the right thing: you're checking that the code does what you intended it to do.

Now how long would it take to write unit tests in addition to that manual check? 20 minutes extra? That sounds like a lot: 25 minutes per feature hour versus 5, best case. But let's work with it.

How many hours' worth of features are you going to write in a given two-week period? 80? Surely not; I generally get around 60 productive hours in a good iteration. Spending 5 minutes per hour ensuring your code works costs 5 of those 60 hours, leaving 55 productive feature hours. Spending 25 minutes per hour costs 25 hours, leaving 35.
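If it helps to see that budget as code, here's a minimal sketch of the arithmetic (Python; the 60-hour iteration and per-hour testing costs are the assumptions above, not measurements):

    # Iteration budget: hours left for features after per-hour testing
    # overhead. The numbers are this post's assumptions.
    ITERATION_HOURS = 60  # productive hours in a good two-week iteration

    def feature_hours(test_minutes_per_hour):
        return ITERATION_HOURS * (1 - test_minutes_per_hour / 60.0)

    print(feature_hours(5))   # manual checks only -> 55.0
    print(feature_hours(25))  # manual checks + unit tests -> 35.0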

But hang on a moment. Let's look back at our first scenario. We spend 5 minutes ensuring that our code does what each requirement is supposed to do. What happens when we change the code to meet the next requirement? Are you sure your previous requirements are still met? Granted, every feature does not depend on every other feature, but let's look at the worst-case scenario for the heck of it; those bugs do love to creep in right where they should be inconceivable. The first feature requires 5 minutes to test. The second requires 5 minutes, plus 5 minutes to regression-test the first. The third requires 15 minutes, the fourth 20. By the fifth feature we've matched the 25-minute per-feature cost of unit testing. We're still ahead, but there's a long way to go and it's only getting more expensive to be sure: testing the nth feature costs 5 × n minutes, so the total grows as 5 × n(n+1)/2. By 8 productive hours we've spent 3 hours testing; by 16 hours, over 11 hours. By 28 productive hours we've run out of time, with over half of our time spent regression testing.
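A quick way to check that worst-case math, as a sketch assuming a flat 1 hour per feature and 5 minutes per manual test, as above:

    # Worst-case manual regression cost: feature n costs 5 minutes to
    # test plus 5 minutes to re-test each feature before it, so total
    # testing time grows as 5 * n * (n + 1) / 2 minutes.
    ITERATION_MINUTES = 60 * 60  # the 60-hour iteration

    total_testing = 0
    for n in range(1, 61):
        total_testing += 5 * n  # test feature n, regression-test 1..n-1
        if n * 60 + total_testing > ITERATION_MINUTES:
            print("budget exhausted at feature %d: %.1f hours of testing"
                  % (n, total_testing / 60.0))
            break
    # -> budget exhausted at feature 28: 33.8 hours of testing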

And this is just the first iteration; it only continues uphill from there. Now, obviously not EVERY feature needs to be fully regression tested or needs a full 5 minutes to test. You make a judgement call weighing risk against cost. There will be times something takes more than 5 minutes to test and less than 5 minutes to regression-test, but then there's also the time needed to record just what needs to be regression tested. Unit tests will often cost MORE than 25 minutes to write, but in the end it doesn't really matter how long they take, because sooner rather than later that regression cost and risk is going to catch up. Are you really that confident that after six months of development, with 4+ developers touching the common code, every requirement you've written still works the way the client intends to see it when you release? Oh yeah, there's a regression test before release (we hope), but how many issues do you think they're going to find when that happens, and how much time are you going to spend tracking each and every deviation down? Testing has a nasty way of starting before the developers' pencils are down, leaving the door truly open to shipping bugs if developers aren't taking responsibility for asserting their changes don't frak up something else.

But now what about the cost of change? Unit tests get in the way: when the client changes their mind and you try changing stuff, tests break and it's more work for you. However, it isn't the unit tests that are expensive, it's the change that is expensive. All the unit tests have done is show you the cost of that change up-front. Those 100 unit tests that broke when you changed that behaviour? Those are the 100 requirements that function based on the original behaviour, and you've just invalidated them. What were you intending to do about those features? Assume they would "just work"? Hope that a regression test picked up those issues? (But won't that still cost you a lot of time and head-scratching to track them all down?) Didn't you expect those 100 areas might depend on the change you were about to make? Or, perhaps, shouldn't you be glad that *you* see the impact of that change before your client does?
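To make that concrete, here's a hypothetical illustration (Python/pytest; the discount function and its rules are invented for this example, not taken from any real project):

    # A requirement pinned down by unit tests. Change the behaviour and
    # the affected tests fail immediately -- each failure is an
    # invalidated requirement you see before the client does.
    def discount(order_total):
        """10% off orders of $100 or more (the original requirement)."""
        return order_total * 0.9 if order_total >= 100 else order_total

    def test_small_orders_pay_full_price():
        assert discount(50) == 50

    def test_orders_of_100_get_ten_percent_off():
        assert discount(120) == 108.0

    # If the client later moves the threshold to $150, changing
    # `discount` turns test_orders_of_100_get_ten_percent_off red --
    # that failing test is the cost of the change, shown up-front.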