Sunday, September 11, 2011

A recent example of the value and purpose of unit tests.

Just last week I was approached by our BA (or BA-ish role), who was tasked with documenting our testing process, and in particular our unit testing strategy. He wanted to know what our unit tests did and how he could express their value to the client. I had just finished tracking down an interesting issue uncovered while writing unit tests to exercise a recent feature, and it made the perfect example of what the unit test suite did and how it added value to the project.

A bit of background into the issue:
The business requirements were that a service would be contacted to retrieve updated events for an order. All events that weren't already associated with the order would be associated with the order. The application would report on any new events associated with the order.
*Events have no distinct key from the service; they are matched up by date/time and event code.

The code essentially did the following (the method returns IEnumerable):


var recentEvents = EventService.GetEventsForOrder(order.Key);
var newEvents = recentEvents.Except(order.Events);
foreach (var newEvent in newEvents)
{
    order.Events.Add(newEvent);
}
return newEvents;

Based on the requirements I scoped out several tests. Ensure that:
Given the event service returns no events, no new events are added or returned.
Given the event service returns new events and the order contains no events, all events are added and returned.
Given the event service returns new events and the order contains one with matching code and date/time, only the new unmatched events are added to the order and returned.
Given the event service returns an event with matching code but different date time, the new event is added to the order without replacing the existing event and returned.


Adding the 2nd test I was surprised to find it failed. The code is de-coupled and the EventService is mocked out to return two events. There were two main asserts: one to assert that the order contained the new events, and a second that the return value contained the new events. The order assert showed it contained the two new events, however the returned set had a count of 0. I'm actually pleased when a test does fail, but this was a momentary "WTF?" followed shortly by a "well, duh!". I was returning the IEnumerable from the Except() expression, and I had added the items to the list driving the Except; any further iteration of that deferred IEnumerable would see no items, since the matches were now in the list being excluded.

The issue was easy enough to fix with a ToList() call, and I felt it warranted a simple comment to explain why the fixed code was written that way, in case someone later went in and tried re-factoring it back to just use the enumerable.
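
The corrected version looked roughly like this (the comment wording here is illustrative):

var recentEvents = EventService.GetEventsForOrder(order.Key);
// Materialize the new events before touching order.Events; leaving this as a
// deferred Except() would re-evaluate against the updated list and yield nothing.
var newEvents = recentEvents.Except(order.Events).ToList();
foreach (var newEvent in newEvents)
{
    order.Events.Add(newEvent);
}
return newEvents;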

This gave me a perfect situation to demonstrate to the BA exactly how the unit tests reflected requirements within the application, and how they served to guard those requirements from unexpected future changes. I integrated the two tests with the working code to show how the CI environment reported that all tests pass, then re-introduced the buggy assumption to show how the tests picked up the error.

The other interesting thing to note is that I also showed the BA our NCover results, and was a bit surprised to see that with just the first two unit tests that block of logic was reporting 100% test coverage. However, I showed that the "touch" counts were mostly 1s, indicating that a single test was touching many parts of the code only once. That is a warning flag to me that the code really isn't being exercised. If I were solely concerned with the coverage percentage I could have left the tests at that, but I knew the other two scenarios were not represented, so I demonstrated how the stats changed by adding them. The coverage remained 100%, but the touch counts increased to a minimum of 3, with an average of 6 to 9. It's not a perfectly reliable indication that the coverage really is complete, but by thinking through the scenarios needed to exercise the code, I'm quite confident that the code is reasonably guarded against the unintentional introduction of bugs.

Monday, February 1, 2010

Unit Testing using IoC Principles and Mocking/Stubbing

Overview
This document provides an outline for producing effective unit tests using inversion of control principles and mocking frameworks.

There are a few different, and sometimes conflicting, definitions of what is a mock versus what is a stub. Generally, a mock performs validation about whether and how a mocked method is called, whereas a stub is merely a placeholder that returns a canned result to a message. For the sake of simplification, this document will use the term mock to represent both mocks and stubs. A mocking framework can easily accommodate both.

Inversion of Control and Testing
One of the key benefits of adopting Inversion of Control (IoC) is the enablement of testing classes in isolation. Take for example a scenario where we have a class responsible for providing functionality around domain objects. There is an additional class that provides validation of business rules such as permissions, and another class that manages interactions with persistence.

We could write this scenario similar to below:
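
A minimal sketch of that tightly coupled version follows; the class names and the method names (CanDelete, CountObjects, DeleteObject, Delete) come from the discussion in this post, while everything else is an assumption for illustration:

public class Location
{
    public int LocationId { get; set; }
}

public class DataContainer
{
    // Persistence concerns: talks directly to the database.
    public string ConnectionString { get; set; }

    public int CountObjects(string criteria, object key) { /* query the database */ return 0; }
    public void DeleteObject(object entity) { /* issue the delete */ }
}

public class LocationValidator
{
    private readonly DataContainer _dataContainer = new DataContainer();

    public bool CanDelete(Location location)
    {
        // Business rule: a location that still has user locations attached cannot be deleted.
        return _dataContainer.CountObjects("UserLocations", location.LocationId) == 0;
    }
}

public class LocationProvider
{
    private readonly LocationValidator _validator = new LocationValidator();
    private readonly DataContainer _dataContainer = new DataContainer();

    public void Delete(Location location)
    {
        if (_validator.CanDelete(location))
            _dataContainer.DeleteObject(location);
    }
}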



It’s pretty straightforward, with a good separation of concerns between objects. But how do we test the LocationProvider or LocationValidator class? LocationProvider is dependent on LocationValidator and DataContainer. LocationValidator is dependent on DataContainer. Assuming the data container has some way of being configured with a connection string, all we can hope to do is set up a test case that points DataContainer at a test database, and the test setup would probably need to insert records into that database to ensure the expected data is present.


Another important consideration from the above is that any test we write for LocationProvider is now dependent on the behaviour of LocationValidator. This problem is compounded as the number and depth of dependencies increases. Our test case for LocationProvider would need to set up data in order to ensure that the validator passes, or fails, as expected. What happens when the validator's rules (behaviour) change? Our LocationProvider unit tests, and any tests dependent on LocationValidator or LocationProvider, will also fail, because they need to be updated to set up data according to the new rules.

This scenario may seem extreme, but it is the primary cause behind the frustration with, and eventual abandonment of, unit testing. Tests become too complex to set up, and far too brittle to be written early in the development cycle. Even when data isn't the central point of contention, developers experience, and then come to fear, situations where changes to one class start breaking unit tests all over the project. This creates “noise” in the test suite, and it takes time to understand and fix all of the affected tests.

It certainly doesn’t need to be like this.

Using IoC to De-couple Dependencies
Inversion of control provides a mechanism to de-couple dependencies from one another, allowing each dependency to be tested independently. IoC can be implemented in a number of ways, such as providing dependencies in the constructor, setting dependencies via properties, or passing necessary dependencies as parameters in the method calls themselves. You could define dependencies as concrete instances, but in the interest of testability the dependencies will be defined as interfaces so that they can be substituted. The following example changes the scenario to provide dependencies via the constructor.
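
A sketch of the constructor-injected version, again with the exact signatures assumed:

public interface IDataContainer
{
    int CountObjects(string criteria, object key);
    void DeleteObject(object entity);
}

public interface ILocationValidator
{
    bool CanDelete(Location location);
}

public class LocationValidator : ILocationValidator
{
    private readonly IDataContainer _dataContainer;

    // The dependency is supplied by the caller instead of constructed internally.
    public LocationValidator(IDataContainer dataContainer)
    {
        _dataContainer = dataContainer;
    }

    public bool CanDelete(Location location)
    {
        return _dataContainer.CountObjects("UserLocations", location.LocationId) == 0;
    }
}

public class LocationProvider
{
    private readonly ILocationValidator _validator;
    private readonly IDataContainer _dataContainer;

    public LocationProvider(ILocationValidator validator, IDataContainer dataContainer)
    {
        _validator = validator;
        _dataContainer = dataContainer;
    }

    public void Delete(Location location)
    {
        if (_validator.CanDelete(location))
            _dataContainer.DeleteObject(location);
    }
}
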
The code around the functionality looks pretty much the same as the original example. The key difference is that the instances of the dependencies are now provided when each class is constructed. Now we have something we can test.

Testing in Isolation with Mocks
Let's first look at writing a test for our Validation class using Moq (2.6). *Note: for newer versions of Moq (3.0+), substitute the “Expect” method name with “Setup”.
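
A sketch of that test, written against the interfaces above and using the Moq 3+ “Setup” naming; the NUnit-style attributes and asserts are an assumption:

using Moq;
using NUnit.Framework;

[Test]
public void EnsureLocationCannotBeDeletedIfUsersAreAssigned()
{
    var location = new Location { LocationId = 5 };                        // (1) stub domain object
    var dataContainer = new Mock<IDataContainer>();                        // (2) stub the dependency
    dataContainer.Setup(dc => dc.CountObjects("UserLocations", location.LocationId))
        .Returns(1);                                                       // (3) pretend a user location exists
    var validator = new LocationValidator(dataContainer.Object);           // (4) inject the stub
    Assert.IsFalse(validator.CanDelete(location));                         // the business rule should say "no"
}
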
In this example, we’ve created a stub object for the location (1) and another for the DataContainer dependency (2). The data container is configured to expect a method call to “CountObjects” and is instructed to return a non-zero value (3). We create the validator, passing in our stubbed data container (4). The test then asserts that the validator returns back “False” based on the results.


By definition, the data container substitute that we’ve created is a stub, not a mock, because this object doesn’t validate that it was actually called. This would be better served by a true mock instance, which can be done by modifying the test as follows:
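
One way to do that with Moq (continuing the sketch above) is to mark the setup as verifiable and then ask the mock whether the call happened:

dataContainer.Setup(dc => dc.CountObjects("UserLocations", location.LocationId))
    .Returns(1)
    .Verifiable();
var validator = new LocationValidator(dataContainer.Object);
Assert.IsFalse(validator.CanDelete(location));
dataContainer.Verify();   // fails the test if CanDelete never asked the data container
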
Converting this to a true mock by adding the verification is important because it ensures that the CanDelete method actually does call the data container (the business logic says it should). The mocking framework will automatically detect if a different method is called, or if different parameter values are passed, but it doesn’t report back whether a call actually occurred unless we ask it.


A note regarding the naming convention for the test: this example unit test is named to reflect the behaviour, or business requirement, of the code being tested. This is in the spirit of Behaviour Driven Development (BDD). By testing around the behaviour, the unit test itself becomes part of the design of the system. It defines a required behaviour, and as an integrated part of the project, anyone seeking to change the requirement (test) or the implemented behaviour (code) will have to ensure that the two reflect one another.

The next obvious test scenario is that CanDelete returns True if user locations are not found.
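
That test looks much the same as the first one, with the stubbed count flipped to zero (again a sketch against the assumed interfaces):

[Test]
public void EnsureLocationCanBeDeletedIfNoUsersAreAssigned()
{
    var location = new Location { LocationId = 5 };
    var dataContainer = new Mock<IDataContainer>();
    dataContainer.Setup(dc => dc.CountObjects("UserLocations", location.LocationId))
        .Returns(0);                                                       // no user locations found
    var validator = new LocationValidator(dataContainer.Object);
    Assert.IsTrue(validator.CanDelete(location));
}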

More on Mocking Frameworks

Mocks and stubs can be hand-rolled by creating test classes that implement the various interfaces. Mocking frameworks like Moq are invaluable because they hide the messy extras of implementing interfaces, where you’d otherwise need to include signatures for every method in the interface. With a mocking framework you only implement the interface methods you want to use. The IDataContainer may eventually have 20 or 30 methods in it, but with a mocking framework your tests can safely be written around only the methods you care about. If you hand-roll a stub class, then as you extend the interface you’ll need to continuously go back to the stubs to add placeholder methods to match it.
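
For contrast, a hand-rolled stub for even the small two-method IDataContainer sketched above already carries a member the test never uses, and it grows with every method added to the interface:

public class DataContainerStub : IDataContainer
{
    public int CountToReturn { get; set; }

    public int CountObjects(string criteria, object key)
    {
        return CountToReturn;                      // canned answer for the test
    }

    public void DeleteObject(object entity)
    {
        throw new System.NotImplementedException(); // placeholder the test never expects to hit
    }
}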

There are a number of mocking frameworks available for .Net, including Rhino Mocks, Moq, NMock, and EasyMock.Net. Rhino Mocks is arguably one of the oldest and most feature-rich. Moq is a “back to basics” mocking framework written around .Net 3.0, offering an elegant, simple tool for defining stubs and mocks. My argument for going with the basic and elegant option is that unit testing should strive to be as flexible and non-intrusive as possible. If tests adopt complex mocking behaviour, you end up spending a lot of time trying to build tests to assert what really should be basic concepts. If you find you need to mock out complex scenarios, then either you need to take another look at how you’re structuring your dependencies, or you’re testing far too deeply into dependency layers.

Testing Layered Dependencies
The real power of using IoC coupled with mocking presents itself in scenarios where dependencies are layered. Let's look at the case of testing the LocationProvider class. This class has a method to perform a delete that asks the validator, then instructs the data provider to perform the action if the validator gives the go-ahead. We already have unit test coverage for the validator to ensure that it does the right thing with inputs and outputs, so as far as tests for the location provider go, the validator can be mocked.
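
Sketched with the same assumed interfaces and Moq setup as before:

[Test]
public void EnsureDeleteOccursIfValidationRulesAreSatisfied()
{
    var location = new Location { LocationId = 5 };                        // (1) stub domain object
    var validator = new Mock<ILocationValidator>();                        // (2) mock the validator
    validator.Setup(v => v.CanDelete(location)).Returns(true);             // (3) give the go-ahead
    var dataContainer = new Mock<IDataContainer>();
    dataContainer.Setup(dc => dc.DeleteObject(location));                  // (4) expect the delete call
    var provider = new LocationProvider(validator.Object, dataContainer.Object);
    provider.Delete(location);                                             // (5) exercise the method under test
    validator.Verify(v => v.CanDelete(location));                          // confirm both calls actually happened
    dataContainer.Verify(dc => dc.DeleteObject(location));
}
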
We create our stub object again (1), and a mock validator (2). We instruct the validator to expect a call to CanDelete with our stubbed location, and to return True, giving the go-ahead to delete (3). We instruct the data container mock to expect a call to DeleteObject with our stubbed location (4). Note that these setups will fail if either call is made with anything other than the reference to our stubbed location. We create an instance of our provider class, providing the two mock dependencies, and call the Delete method with the stubbed location (5). The final step is to verify that our mocked methods were actually called.


Notice that we don’t need to add any complexity around what the validator does internally, such as setting up the data container to expect count requests. That behaviour is already tested. All we care about is that the delete goes ahead when the validator is satisfied.

The next scenario to test is that a delete *doesn’t* occur if the validator reports back a no-go condition.
The data provider won’t be called if the validator returns false, so that obviously won’t pass if we try to verify the call; we remove that setup and its verify. But wait: now this test doesn’t actually test anything other than that the call to the validator occurs, which is ineffective. What we really want is to verify that the data provider *doesn’t* get called in this scenario:
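
Sketched with Moq (the exception type and message here are illustrative):

[Test]
public void EnsureDeleteDoesNotOccurIfValidationRulesAreNotSatisfied()
{
    var location = new Location { LocationId = 5 };
    var validator = new Mock<ILocationValidator>();
    validator.Setup(v => v.CanDelete(location)).Returns(false);            // validator says no-go
    var dataContainer = new Mock<IDataContainer>();
    dataContainer.Setup(dc => dc.DeleteObject(It.IsAny<object>()))
        .Throws(new System.InvalidOperationException("DeleteObject must not be called when validation fails."));
    var provider = new LocationProvider(validator.Object, dataContainer.Object);
    provider.Delete(location);                                             // should complete without touching the data container
    validator.Verify(v => v.CanDelete(location));
}
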
To accomplish this, we put the .Expect call back in, but instead of doing a verify against it, we instruct it to throw an exception. This will cause the test to fail should the data container be instructed to delete an object even though the validator said no-go.

This type of enforcement with mocks is invaluable for detecting and preventing aggravating bugs that would otherwise work their way into the system undetected until late in the development cycle, or worse, until the product makes it into production.

Let's look at a simple example of how this kind of thing could happen.

Cowboy Coding
Meet Cowboy Cody. Cody's a great coder, quick as a whip. Cody starts off running with his new enhancement: add a condition to change some of the rules around deleting locations. He looks at the existing code and gets as far as the LocationProvider, spotting the method he wants to change. "What's this validator thing? Aw, hell, that kind of stuff isn't necessary; all I want to do is add this one little condition. I don't want to go through the trouble of adding a parameter, so it's going to be as easy as adding a module-level variable. Haha, 5 minutes and I'm done!"

Let's see what damage Cody's done…
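
A rough guess at what Cody's change inside LocationProvider might have looked like, based on the description that follows:

private bool _someOtherCondition = true;        // Cody's new module-level flag

public void Delete(Location location)
{
    if (_someOtherCondition || _validator.CanDelete(location))
        _dataContainer.DeleteObject(location);
}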

At a glance it looks harmless enough. Cody was at least good enough to go into the application and try out some of the conditions around his new 5-minute flag. He’s checked his code in.


Prudent Pat, the team lead, sees that changes have been made to the project and immediately runs the test suite before starting his own enhancements. Red light! “Ensure Delete Doesn’t Occur If Validation Rules Are Not Satisfied” has failed: DeleteObject has been called!

It turns out that the default for _someOtherCondition is “true”, and Cody has introduced a behaviour-change bug where his added condition can override the other validation rules. Pat consults Cody, and Cody admits, “Whups, that should have been an ‘&&’.”

The next morning, Cody’s assigned the task of “fixing” his change properly, in the right place, and correctly ensuring the unit tests reflect the behaviour.

Conclusion
Hopefully this is a useful guide to applying IoC principles in combination with a mocking framework to write effective unit tests. Writing unit tests can seem a daunting task at first, but gradually you will find that the quality of the tests you write goes up, while the time spent writing them and managing breaking scenarios goes down.

One Final Mention
Unit testing should not dramatically increase the amount of time you spend writing code. Every developer already factors in time to start up the project and try out their changes as they work on them. Adopting unit tests takes the majority of that time investment and uses it to create tests that are repeatable. The initial investment is larger than the time you might spend trying out a single change, but hopefully you aren't in the habit of only trying things out when you think you're “done”. The cost of running existing unit tests to ensure you haven't accidentally broken anything is a lot less than the cost of manually running through existing functionality as you introduce new functionality.

Sunday, January 24, 2010

The cost of TDD / Test-soon-development.

A common argument against writing automated unit tests is that no one has time to write them. Ultimately this is not true: writing code with unit tests takes no longer than writing code without them.

Without unit tests, devs still need to assert that their code works. This typically means writing the code, then launching the application and ensuring the code does what it’s supposed to do. This is repeated as you continue to implement functionality. Unit tests are written using the time you would normally spend running the application to assert that the code works. The real difference appears when you start looking at continuous revision. As you revise code, are you always going back over the complete functionality to ensure your new code hasn’t broken any previously verified behaviour? Honestly, you should be, but think how much time that would take.

As you begin to invest pieces of time writing unit tests instead of manually firing up the application for one-time tests, you are still asserting that your code does what you expect it to do, and you gain that regression assessment for free.

For the record, my acronym for my testing style would be BDTSD (Behaviour Driven Test Soon Development). :)