Sunday, February 28, 2010

Descriptive Code

I'm not a big fan of the term "self-documenting code", mainly because it seems to light fires under both extremes: those who believe it's nobler to comment everything, and those who insist all code should express what it does perfectly well on its own. As with everything, moderation is the key.

I've participated in project teams that believed every method in every class should be commented, and in others where I was told in a code review to remove all comments. At the time I was a bit confused about why I would be told to remove comments, but over time it began to make sense.

The objective of self-documenting code is that the code should describe its intent without the need for additional external documentation.

My original attitude about commenting code was that while anyone reading code should be able to understand "what" the code does, there should be comments to reflect "why" the code does what it does. However, the trouble with commenting is that as you refactor code, the intent of the code can drift from what was originally commented. "Comments lie."

On the other end of the spectrum, commenting everything is a complete waste of time and just leads to copy & paste errors, and more lies. Why do you need a comment like this on something like a Customer Presenter?

/// <summary>
/// Saves the provided Customer.
/// </summary>
/// <param name="customer">Customer to save.</param>
public void Save(Customer customer)
{
//...
}

Ok, if your intent is to produce documentation for an API, by all means comment public interface methods and comment them well; but for general code? This is just a waste of time. Most of the time when I see this in practice there are plenty of examples where the comment headers aren't accurate anyway, referencing parameters that were removed, or missing parameters that were added.

I still stand by my original belief. Code by itself is misleading; it only tells half of the story. I can see what it does, but not why it does it. The best advice I can give regarding self-documenting code:

1. Code should be organized into distinct units of work. Each unit should do one thing, and one thing only, even if that is just to coordinate other distinct actions. The name of that method should perfectly describe what the code is supposed to do.
For example: private void updateOrderStatusToDispatched(Order order)
This method name perfectly describes what you intend its code to perform. When this method is called it is crystal clear what you are intending to do.

2. Reduce, or better yet eliminate, the use of conditional logic in your code. Nested ifs/else ifs or a switch statement is a good indication that you can break code down into smaller units of meaningful work. 90% of the "if" statements I use in code are for fail-fast assert behaviour: if the method should do nothing based on the parameters it's given, return now; if a parameter isn't valid, throw an exception. (There's a small sketch of this at the end of the post.)

Where you do have conditional bits for a reason, be sure to use a comment to describe "why" that condition exists. Too many times I've seen code break when some undocumented condition was added for one particular case; it caused a bug in another case, so the "fix" of removing that condition repaired the new case but broke the original one. You might remember why you added an "if" condition around a particular block after 1 month, or maybe 3 months, but how about 6 months? Two years? And what about the other poor souls working on the code base with you?

3. Write unit tests around the behaviour of the functionality, not the specific code implementation. Something adopted from BDD (Behaviour-Driven Design/Development) is that a unit test can represent the original functional requirement for the application. In this way, the code being tested is documented with the "why" behind the way it is implemented.
For example:
[Test]
public void EnsureThatWhenACustomerConfirmsAnOrderTheOrderIsDispatched()

This test method would not test a method like updateOrderStatusToDispatched() directly; the update method is a private member. Instead, it tests the behaviour of a customer confirming an order and asserts that the expected result of that action is that the order is dispatched.

Methods such as updateOrderStatusToDispatched() serve self-documentation by describing precisely what their intended behaviour is. By incorporating unit tests that represent the various requirements of our application, our code is now documenting itself in ways that are virtually impossible with separate requirements and design documents. If I change the behaviour of the code unexpectedly, my unit tests break, telling me I changed behaviour. If I need to change or add behaviour, the changes to my unit tests assert those changes as I redesign the code implementation. Keeping a separate "Word" document up to date for a software design is frustrating by comparison.
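To make the three points above a little more concrete, here's a small hypothetical sketch. None of it comes from a real codebase: the Order type, OrderStatus enum, and the _orderRepository/_notificationService fields are invented purely for illustration. It shows a public method organized into small, descriptively named units, a fail-fast guard instead of nested conditions, and a "why" comment only where something genuinely needs one.

public void ConfirmOrder(Order order)
{
    // Fail fast: nothing to do without an order.
    if (order == null)
        throw new ArgumentNullException("order");

    updateOrderStatusToDispatched(order);
    notifyCustomerOfDispatch(order);
}

private void updateOrderStatusToDispatched(Order order)
{
    order.Status = OrderStatus.Dispatched;
    _orderRepository.Save(order);
}

private void notifyCustomerOfDispatch(Order order)
{
    // Why: (hypothetical) dispatch notices are a contractual requirement for
    // confirmed orders, not a nice-to-have, so this stays even for test customers.
    _notificationService.SendDispatchNotice(order);
}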

Wednesday, February 24, 2010

It's official. I HATE Reporting Services.

I started working with Reporting Services about 2 months ago. Coming from a Crystal Reports background, and all of the trials and tribulations that go along with that, I was skeptical but optimistic about working with an alternative.

I put Microsoft's earlier attempt to provide a reporting solution other than Crystal with Visual Studio behind me (which ended up as little more than a glorified Print Form) and dived in.

At first I liked what I saw. Shared data sources were a little twitchy, but overall it worked well, and deployment to the report server was a breeze, either through Visual Studio or the web interface. Bringing up a report in a pop-up window was a snap. It seemed a bit "wrong" that users would need to install something in order to print those reports, and I don't see where/if/how/why HTML rendering wasn't provided. But in all, getting some simple reports out to local users was quite pleasant. Then the trolls started creeping out from under their bridge.

The first point of pain came when I had to generate a rather complex report similar to a cross-tab. Getting the data into the report wasn't a problem, but keeping the report confined to a single page proved to be a challenge. The issue was due to some form of moronic layout engine Microsoft adopted for the report designer. I had a set of text and calculated fields up at the top of the report (my report header) and a tablix control well below them. The tablix control would dynamically create columns based on the report data, so it was sized quite small. For some reason the controls positioned to the right of the tablix would shift right so that they always printed after the end of the tablix control. I could understand this if the controls somehow overlapped, but these were up at the top of the screen. The result was header controls being pushed off onto a second page.

No issue, I thought; I should be able to just create a section for the report header to stop confusing the layout with the tablix control... except there is no concept of sections in the designer.
My only option was to place these controls in the page header, which in this case is not a major issue, but it doesn't give me the option of a true report header.

The real kick in the teeth with Reporting Services is the draconian lock-down making it virtually impossible to expose a report in an extranet capacity. I can understand arguments for not wanting to publish reports on the Internet; by all means, educate developers to avoid such a configuration, or show how to do it securely, but to just lock the option out completely? Come on! I have an external site that *isn't* part of our network that needs access to the reports. No, we don't use VPN tunnelling or want to set them up as part of our network in any capacity. We just want them to access a website (which is locked down by IP filtering) and be able to click on a link to bring up a report. But it's effectively impossible with Reporting Services 2008.

IMO a reporting service should just GENERATE REPORTS. Let me take care of who has access and how they access it; Reporting Services, just produce the bleeping report.

Saturday, February 20, 2010

Frameworks are like marrying your Sister

I came across a rather interesting article giving an opinion on ORMs & DBMSs here, which is a bit of a push away from the old mentality and towards considering new options when it comes to persisting data.

If using a DBMS is like kissing your sister, designing & building "frameworks" should be like marrying her. That might be a bit too strong (as one should never marry one's sister), but 99.5% of the time a development team comes up with the idea that "we should write a framework", they really, really should think again. I'm of course talking about development teams whose job is to write "application" software, *not* development teams whose primary goal is to write a framework. (A la DNN, etc.)

Frameworks are a good thing when your goal is to apply a consistent approach to a problem domain. When you invest in a framework, the reward is to utilize it in as many ways, and as many times, as it takes to make the investment worthwhile. It isn't worthwhile to create a framework that is used once, twice, five times, or even ten times. If the framework can be used in 100 different projects or more, *then* it is worthwhile. Time is the key factor in why application development teams should *never* waste time trying to write frameworks. How long will it be before that framework is used in even 5 projects, let alone 100?

Time is every framework's enemy. Technologies are constantly changing, and it takes quite an effort to keep any piece of software up to date to take advantage of current mainstream technology. If your development team is expected to produce 10 applications (or distinct major releases of applications) over the next five years, can you afford or justify keeping all previous versions of those applications updated while you're also supposed to be implementing new functionality on new products? If you let projects rot on the vine with previous builds of the framework, you must realize that clients are going to expect to modify and enhance those applications for years to come.

There are a number of arguments I've heard in favor of building frameworks, such as "code re-use". You do not need a framework to re-use code; libraries and components are code re-use. "You can re-configure a framework or plug into it to introduce future functionality." Sure, but that requires investing today in what you may not, and likely will not, need tomorrow. And how much extra effort is going to be spent when it turns out what you guessed yesterday was wrong, and you need to re-work or work around the framework to implement today's functionality?

Sometimes it's just that development teams have stars in their eyes; a dream that they can write the ultimate system today so that tomorrow, and for years to come, they can sit back smugly as new functional requests come in, just tweaking a couple of settings and writing a simple module. (They dream of writing themselves out of a job. :) But there's the kicker: "years". It will take years, many years, before any framework constructed by an application team returns its initial investment of tens to hundreds of thousands of lines of code, specific technologies and back-ends all perfectly aligned to meet today's vision. Ask any developer today whether they want to be working with .Net 1.1. Five years from now, are they going to be "happy" maintaining a .Net 3.5 framework? Or were they planning on upgrading their framework, and every application on it to date, to .Net 4, .Net 5, and .Net 6 over the next five to ten years?

A framework is like building a closet organizer. If you build it for only what you need today, it would likely be a waste of space, and expensive/messy to tear out and replace when your needs change. So you plan for the future and build it to be flexible, to accommodate every thinkable scenario... Until you realize you didn't think of a place for your ties, so you're left stapling them to the wall.

A framework built in .Net 1.1 such as DNN justifies its existence by enticing 1000 projects to use it. It evolves into .Net 2.0 and beyond on the merits of, and feedback from, previous versions, hopefully to be used on a further 10,000 or more projects. When your goal today is to write a handful of applications and you get the idea of writing that mondo-cool framework: think of your sister.

Monday, February 1, 2010

Unit Testing using IoC Principles and Mocking/Stubbing

Overview
This document is geared towards providing an outline for producing effective unit tests using inversion of control principles and mocking frameworks.

There are a few different, and sometimes conflicting, definitions of what a mock is versus what a stub is. Generally a mock performs validation about if and how a mocked method is called, whereas a stub is merely a placeholder that returns a canned result to a message. For the sake of simplification, this document will use the term mock to represent both mocks and stubs. A mocking framework can easily accommodate both.

Inversion of Control and Testing
One of the key benefits of adopting Inversion of Control (IoC) is the ability to test classes in isolation. Take for example a scenario where we have a class responsible for providing functionality around domain objects. There is an additional class that provides validation of business rules such as permissions, and another class that manages interactions with persistence.

We could write this scenario similar to below:
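(The original post showed this as an embedded code figure which hasn't survived. Below is a minimal sketch of the shape being described; the class and method names come from the text, but the exact signatures and the Location type are my own assumptions.)

public class Location { public int Id { get; set; } }

public class DataContainer
{
    // Stand-in for whatever data-access plumbing the real container wraps.
    public int CountObjects(Location location) { /* query the database */ return 0; }
    public void DeleteObject(Location location) { /* issue the delete */ }
}

public class LocationValidator
{
    // The validator news up its own data container, so it can only ever
    // talk to a real database.
    private readonly DataContainer _data = new DataContainer();

    public bool CanDelete(Location location)
    {
        // Business rule: a location that still has user locations can't be deleted.
        return _data.CountObjects(location) == 0;
    }
}

public class LocationProvider
{
    private readonly LocationValidator _validator = new LocationValidator();
    private readonly DataContainer _data = new DataContainer();

    public void Delete(Location location)
    {
        if (_validator.CanDelete(location))
            _data.DeleteObject(location);
    }
}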



It’s pretty straightforward, with a good separation of concerns between objects. But how do we test the LocationProvider or LocationValidator class? LocationProvider is dependent on LocationValidator and DataContainer. LocationValidator is dependent on DataContainer. Assuming that the data container has some method of being configured with a connection string, all we can hope to do is set up a test case to ensure that DataContainer points at a test database, and the test setup should probably insert records into that database to ensure that the expected data is present.


Another important consideration from the above is that any test we write for LocationProvider is now dependent on the behaviour of LocationValidator. This problem is compounded as the number and depth of dependencies increases. Our test case for LocationProvider would need to set up data in order to ensure that the validator passes or fails as expected. What happens when the validator rules (behaviour) change? Our LocationProvider unit tests, and any tests that depend on the LocationValidator or the LocationProvider, will also fail, because they need to be updated to set up data according to the new rules.

This scenario may seem extreme, but it is the primary cause behind the frustration and eventual abandonment of unit testing. Tests become too complex to set up, and far too brittle to be written early in the development cycle. It may not always be as extreme as this, with data as the central contention point for unit tests, but developers experience and then fear situations where changes to one class start breaking unit tests all over the project. This creates “noise” in the test suite, and it takes time to understand and fix all of the affected tests.

It certainly doesn’t need to be like this.

Using IoC to De-couple Dependencies
Inversion of control provides a mechanism to de-couple dependencies from one another, allowing each dependency to be tested independently. IoC can be implemented a number of ways, such as providing dependencies in the constructor, setting dependencies via properties, or passing necessary dependencies as parameters within the method calls themselves. You could define dependencies as concrete instances, but in the interest of testability the dependencies will be defined as interfaces so that they can be substituted. The following example changes the scenario to provide dependencies via the constructor.
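(Again, the original figure is missing; here's a rough reconstruction of how the constructor-injected version might look. The ILocation, IDataContainer and ILocationValidator interface names are assumptions; only the method names CanDelete, CountObjects and DeleteObject are taken from the text.)

public interface ILocation { }

public interface IDataContainer
{
    int CountObjects(ILocation location);
    void DeleteObject(ILocation location);
}

public interface ILocationValidator
{
    bool CanDelete(ILocation location);
}

public class LocationValidator : ILocationValidator
{
    private readonly IDataContainer _data;

    public LocationValidator(IDataContainer data)
    {
        _data = data;
    }

    public bool CanDelete(ILocation location)
    {
        // Same rule as before: no delete while related user locations exist.
        return _data.CountObjects(location) == 0;
    }
}

public class LocationProvider
{
    private readonly ILocationValidator _validator;
    private readonly IDataContainer _data;

    public LocationProvider(ILocationValidator validator, IDataContainer data)
    {
        _validator = validator;
        _data = data;
    }

    public void Delete(ILocation location)
    {
        if (_validator.CanDelete(location))
            _data.DeleteObject(location);
    }
}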
The code around the functionality looks pretty much the same as the original example. The key difference is that the instances of the dependencies are now provided when each class is constructed. Now we have something we can test.

Testing in Isolation with Mocks
Let's first look at writing a test for our Validation class using Moq 2.6. *Note: for newer versions of Moq (3.0+), substitute the “Expect” method name with “Setup”.
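(The test figure is also missing; the sketch below is what it plausibly looked like against the assumed interfaces above, using NUnit and Moq 2.6's Expect syntax. The test name is invented.)

using Moq;
using NUnit.Framework;

[TestFixture]
public class LocationValidatorTests
{
    [Test]
    public void EnsureThatALocationWithUserLocationsCannotBeDeleted()
    {
        // (1) A stub for the location being validated.
        var location = new Mock<ILocation>();

        // (2) A stub for the IDataContainer dependency.
        var dataContainer = new Mock<IDataContainer>();

        // (3) Return a non-zero count when CountObjects is asked about our location.
        //     (Moq 2.6 syntax; on 3.0+ this would be Setup instead of Expect.)
        dataContainer.Expect(d => d.CountObjects(location.Object)).Returns(1);

        // (4) Build the validator, passing in the stubbed data container.
        var validator = new LocationValidator(dataContainer.Object);

        // The validator should refuse the delete.
        Assert.IsFalse(validator.CanDelete(location.Object));
    }
}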
In this example, we’ve created a stub object for the location (1), and another for the DataContainer dependency (2). The data container is configured to expect a method call to “CountObjects” and is instructed to return a non-zero value (3). We create the validator, passing in our stubbed data container (4). The test then asserts that the validator returns “False” based on the results.


By definition, the data container substitute that we’ve created is a stub, not a mock, because this object doesn’t validate that it was actually called. This would be better served by a true mock instance, which can be done by modifying the test as follows:
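(A guess at the modified test; the only change from the stub version above is the verification step at the end.)

[Test]
public void EnsureThatALocationWithUserLocationsCannotBeDeleted()
{
    var location = new Mock<ILocation>();
    var dataContainer = new Mock<IDataContainer>();
    dataContainer.Expect(d => d.CountObjects(location.Object)).Returns(1);

    var validator = new LocationValidator(dataContainer.Object);
    Assert.IsFalse(validator.CanDelete(location.Object));

    // The extra step that turns the stub into a mock: verify that the
    // expectation on CountObjects was actually exercised by CanDelete.
    dataContainer.VerifyAll();
}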
Converting this to a true mock by adding the verification is important because it ensures that the CanDelete method actually does call the data container. (The business logic says it should.) The mocking framework would automatically detect if a different method was called, or different values for parameters, but it doesn’t report back whether a call actually occurred unless we ask it.


A note regarding the naming convention for the test: this example unit test is named to reflect the behaviour, or business requirement, of the code being tested. This is in the spirit of Behaviour Driven Development (BDD). By testing around the behaviour, the unit test itself becomes part of the design for the given system. It defines a required behaviour, and as an integrated part of the project, anyone seeking to change the requirement (test) or the implemented behaviour (code) will have to ensure that the two reflect one another.

The next obvious test scenario is that CanDelete returns True if user locations are not found.
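A sketch of that second test, under the same assumptions:

[Test]
public void EnsureThatALocationWithoutUserLocationsCanBeDeleted()
{
    var location = new Mock<ILocation>();
    var dataContainer = new Mock<IDataContainer>();

    // This time no user locations are found, so the count comes back zero.
    dataContainer.Expect(d => d.CountObjects(location.Object)).Returns(0);

    var validator = new LocationValidator(dataContainer.Object);

    Assert.IsTrue(validator.CanDelete(location.Object));
    dataContainer.VerifyAll();
}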

More on Mocking Frameworks

Mocks and stubs can be hand-rolled by creating test classes that implement the various interfaces. Mocking frameworks like Moq are invaluable because they hide the messy extras of implementing interfaces, where you’d need to include signatures for every method in the interface. With a mocking framework you only implement the interface methods you want to use. IDataContainer may eventually have 20 or 30 methods in it, but with the mocking framework your tests can safely be written around only the methods you care about. If you hand-roll a stub class, then as you extend the interface you’ll need to continuously go back to the stubs to add placeholder methods to match the interface.
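For contrast, here's roughly what a hand-rolled stub against the assumed IDataContainer interface from earlier would look like; every time the interface grows, this class has to grow with it:

public class StubDataContainer : IDataContainer
{
    // Canned value for the one method the validator tests actually care about.
    public int CountToReturn { get; set; }

    public int CountObjects(ILocation location)
    {
        return CountToReturn;
    }

    public void DeleteObject(ILocation location)
    {
        // Not needed for the validator tests, but it has to exist to satisfy the interface.
        throw new System.NotImplementedException();
    }
}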

There are a number of mocking frameworks available for .Net, including Rhino Mocks, Moq, NMock, and EasyMock.Net. Rhino Mocks is arguably one of the oldest and most feature-rich. Moq is a “back to basics” mocking framework written around .Net 3.0, offering an elegant, simple tool for defining stubs and mocks. My argument for going with the basic and elegant option is that unit testing should strive to be as flexible and non-intrusive as possible. If tests adopt complex mocking behaviour, you end up spending a lot of time trying to build tests to assert what really should be basic concepts. If you find you need to mock out complex scenarios, then either you need to take another look at how you’re structuring your dependencies, or you’re testing far too deeply into dependency layers.

Testing Layered Dependencies
The real power of using IoC coupled with mocking presents itself in scenarios where dependencies are layered. Let's look at the case of testing the LocationProvider class. This class has a method to perform a delete that asks the validator, then instructs the data container to perform the action if the validator gives the go-ahead. We already have unit test coverage for the validator to ensure that it does the right thing with inputs and outputs, so as far as tests for the location provider go, the validator can be mocked.
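(Another missing figure; a plausible reconstruction is below, with numbered comments matching the walkthrough that follows.)

[Test]
public void EnsureThatWhenValidationRulesAreSatisfiedTheLocationIsDeleted()
{
    // (1) Stub location.
    var location = new Mock<ILocation>();

    // (2) Mock validator.
    var validator = new Mock<ILocationValidator>();

    // (3) The validator gives the go-ahead for this specific location.
    validator.Expect(v => v.CanDelete(location.Object)).Returns(true);

    // (4) The data container expects to be told to delete that same location.
    var dataContainer = new Mock<IDataContainer>();
    dataContainer.Expect(d => d.DeleteObject(location.Object));

    // (5) Wire up the provider with its two mocked dependencies and call Delete.
    var provider = new LocationProvider(validator.Object, dataContainer.Object);
    provider.Delete(location.Object);

    // Verify that both mocked calls actually happened.
    validator.VerifyAll();
    dataContainer.VerifyAll();
}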
We create our stub object again (1) and a mock validator (2). We instruct the validator to expect a call to CanDelete with our stubbed location, and to return True, giving the go-ahead to delete (3). We instruct the data container mock to expect a call to DeleteObject, provided with our stubbed location (4). Note that these calls will fail if either is made with anything other than the reference to our stubbed location. We create an instance of our provider class, providing the two mock dependencies, and call the Delete method with the stubbed location (5). The final step is to verify that our mocked methods were actually called.


Notice that we don’t need to add any complexity around trying to ensure what the validator does, where the data container would be expecting count requests and such. That behaviour is already tested. All we care about is that a delete goes ahead when the validator is satisfied.

The next scenario to test is that a delete *doesn’t* occur if the validator reports back a no-go condition.
The data container won’t be called if the validator returns false, so the test obviously won’t pass if we try to verify that call; we removed that line and the verify. But wait, now this test doesn’t actually test anything other than that the call to the validator occurs. This is ineffective. What we really want to do is verify that the data container *doesn’t* get called in this scenario:
To accomplish this, we put the .Expect call back in. But instead of doing a verify against it, we instruct it to throw an exception. This will cause the test to fail should the data container be instructed to delete an object even if the validator said no-go.
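(A sketch of the finished test, with the throwing expectation standing in for the missing figure; the test name is taken from the failure Pat sees later on.)

[Test]
public void EnsureDeleteDoesntOccurIfValidationRulesAreNotSatisfied()
{
    var location = new Mock<ILocation>();

    // The validator reports a no-go for this location.
    var validator = new Mock<ILocationValidator>();
    validator.Expect(v => v.CanDelete(location.Object)).Returns(false);

    // If the provider ignores the validator and calls DeleteObject anyway,
    // this expectation throws and the test fails.
    var dataContainer = new Mock<IDataContainer>();
    dataContainer.Expect(d => d.DeleteObject(location.Object))
        .Throws(new System.InvalidOperationException("DeleteObject must not be called when validation fails."));

    var provider = new LocationProvider(validator.Object, dataContainer.Object);
    provider.Delete(location.Object);

    // Still verify that the validator was consulted.
    validator.VerifyAll();
}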

This type of enforcement with mocks is invaluable to help detect and prevent aggravating bugs from working their way into the system undetected until late in the development cycle, or worse, when the product makes it into production.

Let's look at a simple example of how this kind of thing could happen.

Cowboy Coding
Meet Cowboy Cody. Cody’s a great coder, quick as a whip. Cody starts off running with his new enhancement: adding a condition to change some rules around deleting locations. He looks at the existing code and gets as far as the LocationProvider, spotting the method he wants to change. “What’s this validator thing? Aw, hell, that kind of stuff isn’t necessary; all I want to do is add this one little condition. I don’t want to go through the trouble of adding a parameter, so it’s going to be as easy as adding a module-level variable. Haha, 5 minutes and I’m done!”

Let's see what damage Cody’s done…
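(The figure showing Cody's change is gone as well; here's a hypothetical reconstruction based on the description that follows, with the module-level flag OR'd into the check.)

public class LocationProvider
{
    private readonly ILocationValidator _validator;
    private readonly IDataContainer _data;

    // Cody's quick-and-dirty flag; it happens to default to true.
    private bool _someOtherCondition = true;

    public LocationProvider(ILocationValidator validator, IDataContainer data)
    {
        _validator = validator;
        _data = data;
    }

    public void Delete(ILocation location)
    {
        // Should have been &&: with || the delete proceeds whenever
        // _someOtherCondition is true, no matter what the validator says.
        if (_validator.CanDelete(location) || _someOtherCondition)
            _data.DeleteObject(location);
    }
}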

At a glance it looks harmless enough. Cody was at least good enough to go into the application and try out some of the conditions around his new 5-minute flag. He’s checked his code in.


Prudent Pat, the team lead, sees that changes have been made to the project, and he immediately runs the test suite before starting his own enhancements. Red light! “Ensure Delete Doesn’t Occur If Validation Rules Are Not Satisfied” has failed: DeleteObject has been called!

It turned out that the default for _someOtherCondition was “true”, and Cody has introduced a behaviour change bug where his added condition can override the other validation rules. Pat consults Cody, and Cody admits, “Whups, that should have been an ‘&&’.”

The next morning, Cody’s assigned the task of “fixing” his change properly, in the right place, and correctly ensuring the unit tests reflect the behaviour.

Conclusion
Hopefully this is a useful guide to applying IoC principles in combination with a mocking framework to write unit tests. Writing unit tests can often be a bit of a daunting task, but gradually you will find the quality of the tests you write goes up as the time spent writing them and managing breaking scenarios goes down.

One Final Mention
Unit testing should not dramatically increase the amount of time you spend writing code. Every developer already factors in time to start up the project and try out their changes as they work. Adopting unit tests takes the majority of that time investment and uses it to create tests that are repeatable. The initial investment is larger than the time you might spend trying out a change, but hopefully you aren’t in the habit of only trying things out when you think you’re “done”. The cost of running existing unit tests to ensure you haven’t accidentally broken anything is a lot less than manually running through existing functionality as you introduce new functionality.