Thursday, December 9, 2010

Ouch, 3-day bug!

Earlier I had posted on how to write tests for the QuantumBits ObservableList class by using a DispatcherCycler to essentially perform a "DoEvents" so that the appropriate events get hooked up.

This was all working just fine until one day, after adding a more complete suite of tests for an implementation of an ObservableList, I had a failing test. I had run the test successfully before, so I took a quick look and re-ran it. It passed. I ran the suite again and it failed. I ran it several times, both as a suite and individually, to rule out an intermittent problem: alone it always succeeded; as part of the suite it always failed. And it failed for a completely wrong reason.
        [Test]
        public void EnsureChangingRelativeDeltaUpdatesRemainingCorrectly()
        {
            var mortgageInterestList = new MortgageInterestList(new MortgageInterest.Initial(0.05m, DateTime.Today));
            var second = new MortgageInterest.Adjustment("+.5%", DateTime.Today.AddDays(2));
            var third = new MortgageInterest.Adjustment("+.2%", DateTime.Today.AddDays(3));
            mortgageInterestList.Add(second);
            mortgageInterestList.Add(third);
            new DispatcherCycler(mortgageInterestList.CurrentDispatcher).Cycle(); // * Failed here

            second.UpdateByAbsoluteOrRelativeValue("+.3%");
            new DispatcherCycler(mortgageInterestList.CurrentDispatcher).Cycle();
            Assert.AreEqual(0.05m, mortgageInterestList.SortedList.First().InterestRate, "First interest rate should not have changed.");
            Assert.AreEqual(0.053m, second.InterestRate, "Second interest rate was not updated.");
            Assert.AreEqual(0.055m, third.InterestRate, "Third interest rate was not updated.");
        }


The guts of the list in question contained a set of data sorted in date order. It contained an initial instance (which could have its date adjusted) and had logic asserting that "adjustment" instances could not be inserted before the initial item, or moved (date changed) prior to it. The error behind the failing test indicated that the test was trying to move an item prior to the initial item, which it obviously wasn't: the test merely changes the value of one of the items, which should update the later item. Debugging the failure, it almost looked like the adjustments were being added to the collection before the initial item, which was completely impossible.

After staring this down for two evenings, looking for bugs in my Dispatcher usage within the ObservableList and cycler, Dispatcher behaviour and multithreading, TDD.Net behaviour, etc., I decided to look for the tests that were supposed to be exercising this guard behaviour. Sure enough, I did have tests for it:
        [Test]
        [ExpectedException(typeof(ArgumentException))]
        public void EnsureAdjustmentInterestRateCannotBeMovedBeforeInitialRate()
        {
            var mortgageInterestList = new MortgageInterestList(new MortgageInterest.Initial(1.25m, DateTime.Today.AddDays(1)));
            mortgageInterestList.Add(new MortgageInterest.Adjustment("+5%", DateTime.Today.AddDays(2)));
            new DispatcherCycler(mortgageInterestList.CurrentDispatcher).Cycle();
            mortgageInterestList.SortedList[1].EffectiveDate = DateTime.Today;
        }


I popped an [Ignore] attribute onto this test and re-ran the suite... All remaining tests pass! I re-enable the test and the other test fails once more. Now I'm seriously doing a WTF? but making progress.

Side Note: Notice that my DispatcherCycler now accepts a dispatcher in its constructor, rather than being a static helper using Dispatcher.CurrentDispatcher. I was chasing all kinds of ideas around whether Dispatcher actually was thread-safe, and whether I needed to ensure the Cycle call would use the same dispatcher as the list. I went through a LOT of trial and error trying to track this behaviour down...

Then I read up on Dispatcher and how it can capture unhandled exceptions. Since this test was waiting for an exception that would have been raised on the Dispatcher (UI) thread, given the nature of ObservableList, I decided to see what would happen if I handled the exception at the Dispatcher level. I added a test fixture setup implementation to handle unhandled exceptions:

            Dispatcher.CurrentDispatcher.UnhandledException += (sender, e) =>
                {
                    e.Handled = true;
                };

Re-running the tests, I had different failures. Essentially the tests that were expecting the exception weren't receiving it... but the originally failing test was passing.

Finally I removed the fixture setup and tried the following:

        [Test]
        [ExpectedException(typeof(ArgumentException))]
        public void EnsureAdjustmentInterestRateCannotBeMovedBeforeInitialRate()
        {
            var mortgageInterestList = new MortgageInterestList(new MortgageInterest.Initial(1.25m, DateTime.Today.AddDays(1)));
            mortgageInterestList.Add(new MortgageInterest.Adjustment("+5%", DateTime.Today.AddDays(2)));
            new DispatcherCycler(mortgageInterestList.CurrentDispatcher).Cycle();
            DispatcherUnhandledExceptionEventHandler handler = (sender, e) =>
            {
                e.Handled = true;
            };
            mortgageInterestList.CurrentDispatcher.UnhandledException += handler;
            mortgageInterestList.SortedList[1].EffectiveDate = DateTime.Today;
            mortgageInterestList.CurrentDispatcher.UnhandledException -= handler;
        }


Sure enough, all tests PASS! I was actually expecting the above test to fail, since the dispatcher would swallow the ArgumentException raised when the effective date was changed, as was the case when I had the handler in the fixture setup. However, the ArgumentException does get raised and handled.

I explicitly add and remove the delegate handler since this would be attaching a handler to a CurrentDispatcher on the executing thread. I don't want it to interfere beyond the scope of this test.

Actually, as I write this I want to be certain that other tests that might trigger an exception (as a test failure) or test the exception behaviour don't trip up other tests.

The code can be moved so that the handler is attached on test setup and removed on test teardown. Fortunately I have a base class wrapper for my nUnit implementation, so this is a snap!

        private DispatcherUnhandledExceptionEventHandler _handler = (sender, e) =>
        {
            e.Handled = true;
        };

        public override void TestSetup()
        {
            base.TestSetup();
            Dispatcher.CurrentDispatcher.UnhandledException += _handler;
        }

        public override void TestTearDown()
        {
            base.TestTearDown();
            Dispatcher.CurrentDispatcher.UnhandledException -= _handler;
        }


And now all tests are covered to behave nicely with one another if the Dispatcher encounters an error. No messy hooks in individual test cases needed.

Damn that felt good to squish!

Wednesday, November 24, 2010

Contract vs. Permanent

This is an odd one that I come across from time to time, and again just recently.
I work as a contractor for a client, and they've expressed an interest in ensuring stability within their development team, so they asked if I would be interested in taking a permanent position. I politely declined, since the benefits of a permanent position don't coincide with my own objectives. I can definitely understand the desire for some stability within a development team, and since I don't have any serious problem with how things are arranged, they have my assurance that I will be available essentially for as long as they need me.

The odd thing is that in talking with permanent staff, the impression of a contractor is of someone who can just vanish tomorrow; that being a "permanent employee" somehow indicates that the company has an investment in you, and you have an investment in them. However, when you look at the facts this doesn't hold any water, especially in the I.T. industry. How does a company invest in you? "Permanent" implies that you will always have your job through hell and high water, yet every permanent employee has the exact same termination clause in their contract as I have in mine: you can be "let go", or decide to leave, with 4 weeks' notice. You could argue that a permanent employee has some additional protection in the form of unfair dismissal, but no more so than I would if I were given a six-month contract which the client tried to end after just two. In either case it will be an ugly business to dispute. Companies will often offer training to permanent employees, yet in the more than 15 years I have been in software development, across 10 organizations/clients (7 of those as a permanent employee), I have received training twice: one 1-day course on advanced VB, and a 2-hour course on time management. I have only rarely seen other permanent employees actually get sent off for training or seminars. And depending on your client, a contractor can receive many of the same perks as permanent staff, such as inclusion in company dinners and gifts of appreciation.

Contracting also has a bit of a myth associated with it that often rubs permanent staff the wrong way: contractors get paid more. False, with a grain of truth. Being a contractor, especially on long-term contracts (>6 mo.), means you are exchanging most of your entitled benefits for cash: annual leave, sick leave, paid holidays, etc. This means if you work the full year, not taking any leave or sick days, as a permanent employee you are entitled to an extra 20 days' pay. If you don't use it, those days are paid out when you leave and taxed at the maximum tax bracket, and you won't get that extra tax back until the end of the tax year. (So in Australia, be sure to quit that permanent position in June.) In the I.T. industry, and especially in smaller organizations, paid leave is rarely "convenient" to take. You're always busy, and you know that any leave you take is going to be a significant burden on the company and your co-workers, plus extra urgent work to worry about when you get back. Some permanents I've worked with try to plan their vacations after the next major release, only to find that the schedule slips and the leave they booked months in advance now falls smack in the middle of the production roll-out. Permanent positions are kind of like an insurance policy: if you're sick or want to take some time off, your income stays stable. But if you don't use that, well, you might be better off with the money up front. Short-term contracts will command marginally higher rates simply because a contractor must factor in the time needed between short-term contracts to secure more work.

The reason I contract as a sole trader is to maximize my income to minimize my debts. There are a number of factors that I considered to reach this:
1) I have a sizable mortgage with an offset account. Offset interest savings are 100% tax free and are valued directly against the current interest rate. (Whereas any other investment earnings are added to and taxed at your overall tax rate, and Super contributions are taxed at 30% with super earnings taxed at 15%.)
2) As a sole trader I invoice my client directly and weekly, collect GST and pay it back quarterly, pay income tax annually, and do not contribute to Superannuation. (Though soon I will be contributing to it again, annually.)
This maximizes the amount of money I keep in my offset account, working to reduce the interest my mortgage is accruing on a daily basis.
3) I'm rarely if ever sick, so sick leave isn't of any value to me. (Since it's not paid out if unused, and I'm generally too ethical to "chuck a sickie".)

Because of this, contracting under this arrangement will help pay off my mortgage many years faster than if I was a permanent employee.

Thursday, November 11, 2010

Confused the compiler/runtime or what?

This was an innocent enough typo, but it had me questioning whether the framework's string.Split() method was working with the RemoveEmptyEntries option.

Line in question:
var parameterPairs = parameterValue.Split(new char['&'], StringSplitOptions.RemoveEmptyEntries).ToList();

The parameterValue being parsed was "&Token=8765"
I expected to get a result back of "Token=8765", but instead got back a value of "&Token=8765".

I wondered: was there a problem with RemoveEmptyEntries when the separator was the first character? Was there a problem with '&' as a separator? Both checked out. Then I noticed the typo. It should have been:

var parameterPairs = parameterValue.Split(new char[] {'&'}, StringSplitOptions.RemoveEmptyEntries).ToList();

And presto, the test passed. But that got me wondering, what was char['&'] resolving to? Why did the compiler allow it to compile, and why did string.Split() seem to merely ignore it at runtime?

I put in a precursor condition:
var test = new char['&'];

along with a breakpoint.

Curious...
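For the equally curious, here's my reading of what that breakpoint reveals (a quick sketch you can verify in the Immediate window, so treat it as my interpretation rather than gospel):

// '&' converts implicitly to int (38), so new char['&'] is an array-*size*
// expression, not an array initializer: a 38-element array of '\0' characters.
var test = new char['&'];

// '\0' never appears in the input, so Split finds no separators and hands back
// the original string untouched, RemoveEmptyEntries or not.
var parts = "&Token=8765".Split(test, StringSplitOptions.RemoveEmptyEntries);
// parts: { "&Token=8765" }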

Tuesday, November 2, 2010

Beware! The clipboard may lead you to question your sanity.

This one had me scratching my head for a good while. I was able to fix it easily enough, but getting to the bottom of just what the hell happened took a bit of digging.


The problem started because I wanted to switch a test from using a certificate located on disk to using a certificate located in a certificate store. For this test I decided to pull the certificate by serial number, so I snatched the serial number from the certificate properties easily enough.

I tried this as-is and it wasn't working. So I tried removing the spaces; still no joy. So I grabbed the entire list of certificates on the machine and found the certificate to verify the serial #. The results showed the serial number in upper case, so I quickly popped a .ToUpper() on my key. NO JOY. Ok, WTF. So I copied the serial number out of the watch results in the IDE and tried again: it worked.

But I couldn't leave it at that. What was wrong with my original value? I restored the changes and created a second set of checks (see first screen) and started cursing my monitor in disbelief. I lined them up, reversed the order they were called in the Find() statement, and cleaned and rebuilt the test application several times. String.Compare reports a match; .Equals reports not equal. The call to the store's Find returns a result for only the one value, regardless of which order they are executed in.

The only possibility: copying and pasting the original value from the certificate dialog resulted in some non-printing Unicode character being embedded in the original string (serial2). String.Compare ignores it, but .Equals does not. (The test1/test2/result check was just me testing my sanity that .Equals on string wasn't doing a reference equals.) To quote Sir Arthur Conan Doyle: "Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth."
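If you ever hit the same thing, one defensive option (a sketch of my own rather than what ended up in the actual test; the method name is made up) is to normalise the pasted serial number before handing it to the store, keeping only the hex digits:

// Requires System.Linq and System.Security.Cryptography.X509Certificates.
private static X509Certificate2 findCertificateBySerial(string pastedSerial)
{
  // Strip spaces and any invisible clipboard stowaways, then match the store's casing.
  var cleanSerial = new string(pastedSerial.Where(Uri.IsHexDigit).ToArray()).ToUpperInvariant();

  var store = new X509Store(StoreName.My, StoreLocation.LocalMachine);
  store.Open(OpenFlags.ReadOnly);
  try
  {
    var matches = store.Certificates.Find(X509FindType.FindBySerialNumber, cleanSerial, false);
    return matches.Count > 0 ? matches[0] : null;
  }
  finally
  {
    store.Close();
  }
}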

Monday, November 1, 2010

Keeping up is fun.

Software development is constantly evolving. Years ago there would be years between product releases, and while there were libraries popping up every so often, there were typically a lot fewer to follow for any given project you might be working on. The pace of change was generally slower.

Today, change is constant and occurring at a significantly faster pace. On top of that, the open-source movement has meant a lot more releases of a lot more useful tools to incorporate within a project. It's easy to get left behind, or to end up chasing new versions without a chance to stop and smell the roses.

A perfect example: nUnit is a brilliant unit test framework port for .Net. I've used it for many years, but I've never really given each new version more than a passing glance. Until I caught a mention of a new attribute called "[TestCase]". (As opposed to "[Test]", which denotes a test case.)

Take a case where you want to test a range of outputs, such as a minimum, maximum, and an assortment of values, to ensure a method is working properly. This might be 4 or more tests, each consisting of setup, call, and assert. For example, I had 3 tests that verified formatting behaviour for a currency converter: positive value, negative value, and zero. Scenarios like this can now be replaced by something like this:

        [TestCase(1000d, Result = "$1,000.00")]
        [TestCase(-1000d, Result = "-$1,000.00")]
        [TestCase(0d, Result = "$0.00")]
        public string EnsureConverterFormatsCurrencyValues(decimal value)
        {
            IValueConverter testConverter = new DecimalToCurrencyConverter();
            return (string)testConverter.Convert((decimal)value, typeof(decimal), null, CultureInfo.CurrentUICulture);
        }
*Sigh*... Those roses sure smell nice when you finally get a chance. Not only is it one test case (executed and reported 3 times), but it doesn't even need the Assert.AreEqual line. It would have been nice to have spotted this a year ago when they added it. :)

*Note: The above declaration is actual code, not a typo. The parameters on the attribute are provided as doubles (d) not decimal (m) even though the parameter type is decimal. Decimals cannot be provided via attributes (a limitation in .Net) so TestCase facilitates this translation.

Thursday, September 23, 2010

Amber, Green, Re-factor

Red, Green, Re-factor (Red-Green for simplicity in this entry) is the mantra of Test-Driven Development (TDD). The goal: to satisfy a requirement, you write a test that fails, then write the functionality to satisfy solely that test, then re-factor as you implement further functionality. There's definitely nothing wrong with this approach, but I don't think it's for everybody. An alternative to TDD is Test-Soon Development (TSD). In this model you write tests soon after you're satisfied that the target requirements have been met. However, TSD is still test-driven development; it just isn't Red-Green. TDD is test-driven because design decisions place an emphasis on meeting requirements with testable and, most importantly, tested code. This is obvious when you write the unit tests first, but it is no less true when writing unit tests second. The key is that the code is written with tests in mind. When you choose to write them is irrelevant, so long as you test it right.
 
TSD is less disciplined than TDD in that it is easier to defer testing as you continue to implement related functionality. This accrues technical debt, leaving a larger pile of unit testing required. Tools like NCover are excellent at tracking your technical debt bill, so if you're using TSD, NCover is an absolute must. Another point the TDD purist will raise against TSD is that unit tests written against working code tend to be written to pass, so they would be less reliable since they may not actually fail when they need to. This is a significant concern, but I think it applies just as much to Red-Green TDD as it does to TSD. In Red-Green you might only write tests that satisfy the "chosen path" but not exercise the code properly when paths cross. Either way you need to really think about the behaviour to test, not just satisfy the obvious aspect of a requirement.

This second concern, of writing tests to fail, really got me motivated to ensure tests were exercised fully alongside the code they intend to test. My motto of choice to this effect is Amber, Green, Re-factor, or Amber-Green. Essentially I write code to satisfy a requirement, then I create tests and exercise test & code (Amber) until I'm satisfied they pass (Green); then the code is ready to be re-factored as necessary for future functionality. The reason I prefer this to TDD is that I've invested the time into satisfying architectural decisions that often intertwine with satisfying requirements. I'm satisfied not "that it works", but rather "that it's going to work". Now the focus shifts to developing the tests to ensure that it truly works. My goal with writing the unit tests is to break the code. I know for a fact that I didn't think of everything, so it's time to find what I missed by probing the code with tests. When I write a test that passes first run, I alter the code and make sure the test fails as it's supposed to. When I write tests that never fail, I become concerned that I'm not exercising the code fully. Even if I think of something I missed, I write the test, watch it fail, then fix it.

Ultimately the approach isn't much different to Red-Green. With Red-Green it pays to think up-front while fleshing out boundaries and behaviours for code before you write it. Personally I find it's easier to flesh this out once I know what I'm planning to take to production, and switching my mentality to "break it!" helps to really exercise the code.

So the next time the TDD purist on your team scorns you for not writing tests first, rest assured you can still be test-driven with Amber-Green; Just keep a close eye on your technical debt and break that code good.

Friday, September 17, 2010

Unit Testing for the QuantumBits ObservableList

In using the ObservableList example from QuantumBits I came across a bit of a problem. This list ensures all event notifications happen on the UI thread by using the dispatcher. One change I added was to expose virtual methods in the ObservableList to "hook in" custom actions, such as a flow-on effect for calculations as items in the list are changed (OnItemAdded, OnItemRemoved, OnPropertyChanged, etc.). I have extended functionality that interacts with these events to run calculations, and I wanted to unit test this behaviour.

An example test:

[Test]
public void EnsureThatRelativeRateIsUpdatedWhenEarlierInterestRateChanges()
{
  var mortgageInterestList = new MortgageInterestList();
  var first = new MortgageInterest.Absolute(0.05m, DateTime.Today);
  var second = new MortgageInterest.Relative("1%", DateTime.Today.AddDays(2));
  mortgageInterestList.Add(first);
  mortgageInterestList.Add(second);

  first.UpdateInterestRate("+0.5%");
  Assert.AreEqual(0.055m, first.InterestRate, "Interest rate was not updated.");
  Assert.AreEqual(0.01m, second.InterestDelta, "Second's delta should not have been updated.");
  Assert.AreEqual(0.065m, second.InterestRate, "Second's interest rate should have been updated.");
}

An absolute interest rate is one where the user says the interest rate was 5% as of a specified date. A relative interest rate is one where the user specifies a +/- delta from the previous interest rate. The idea is that if an interest rate changes right before a relative rate, the relative's delta should be re-applied to adjust its rate.

The problem is that while the MortgageInterestList (extending ObservableList) is wired up with the event to perform the calculation, when running in a unit test the dispatcher never actually gets to run, so the event hooks aren't set up and the edit doesn't trigger the flow-on effect.

The problem stems from this in the ObservableList:
public void Add(T item)
{
  this._list.Add(item);
  this._dispatcher.BeginInvoke(DispatcherPriority.Send,
    new AddCallback(AddFromDispatcherThread),
    item);
  OnItemAdded(item);
}

This method is perfectly correct, and in an application utilizing this functionality it works just great. However, unit tests don't like this kind of thing. The callback is added, but even with a Priority of "Send" (highest) the hookup doesn't actually happen during the test execution. The event doesn't fire, the rate isn't updated, the test fails.

After a lot of head scratching and digging around on the web, I came across the solution on StackOverflow. Ironically, while someone there was having a similar problem to mine with testing Dispatcher-related code, this wasn't the accepted answer, but it worked a treat.
2nd Answer

public static class DispatcherUtil
{
  [SecurityPermissionAttribute(SecurityAction.Demand, Flags = SecurityPermissionFlag.UnmanagedCode)]
  public static void DoEvents()
  {
    DispatcherFrame frame = new DispatcherFrame();
    Dispatcher.CurrentDispatcher.BeginInvoke(DispatcherPriority.Background,
      new DispatcherOperationCallback(ExitFrame), frame);
    Dispatcher.PushFrame(frame);
  }

  private static object ExitFrame(object frame)
  {
    ((DispatcherFrame)frame).Continue = false;
    return null;
  }
}

Basically it's a mechanism that you can run inside a test to signal the dispatcher to go ahead and process anything currently pending. This way the events will be wired up before the test resumes. Damn simple and it works.

The test in this case changes to:
[Test]
public void EnsureThatRelativeRateIsUpdatedWhenEarlierInterestRateChanges()
{
   var mortgageInterestList = new MortgageInterestList();
   var first = new MortgageInterest.Absolute(0.05m, DateTime.Today);
   var second = new MortgageInterest.Relative("1%", DateTime.Today.AddDays(2));
   mortgageInterestList.Add(first);
   mortgageInterestList.Add(second);
   DispatcherUtil.DoEvents();

   first.UpdateInterestRate("+0.5%");
   Assert.AreEqual(0.055m, first.InterestRate, "Interest rate delta change was not based on the previous interest rate.");
   Assert.AreEqual(0.01m, second.InterestDelta, "Second's delta should not have been updated.");
   Assert.AreEqual(0.065m, second.InterestRate, "Second's interest rate should have been updated.");
}

SO if "jbe" should ever read this, thanks a bundle for the solution!

I've never been a fan of DoEvents-style coding in VB, which this pretty closely mirrors, so this little gem will live in my unit testing library, where I have my base classes for uniform test fixture behaviour.

*Edit* I didn't much like the name of the class when I incorporated it into my test library. The updated signature is:

public static class DispatcherCycler
{
  [SecurityPermissionAttribute(SecurityAction.Demand, Flags = SecurityPermissionFlag.UnmanagedCode)]
  public static void Cycle()
  {
    DispatcherFrame frame = new DispatcherFrame();
    Dispatcher.CurrentDispatcher.BeginInvoke(DispatcherPriority.Background,
    new DispatcherOperationCallback((f) =>
      {
        ((DispatcherFrame)f).Continue = false;
        return null;
      })
      , frame);
    Dispatcher.PushFrame(frame);
  }
}

Wednesday, September 15, 2010

Linq2SQL Cache & Connectivity

One "gotcha" you will need to consider when using Linq2SQL is how to handle situations where a connection is lost at a moment when a SubmitChanges call is made. (Basically the period of time between when the last object reference is retrieved from the cache, and SubmitChanges is called.)

The SubmitChanges call will fail, telling you what is wrong; however, the cached copies of the data will hold onto their changes until they are committed or refreshed. This can have some truly confusing side-effects that are a lot of "fun" to track down.
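To illustrate the trap with a stripped-down sketch (Box, BoxStatus and the Log call are made-up names, and the DataContext is assumed to be shared across operations):

// 'context' is a long-lived, shared DataContext.
var box = context.Boxes.Single(b => b.Barcode == scannedCode);
box.Status = BoxStatus.InStock;

try
{
  context.SubmitChanges();   // connection drops out right here -> exception
}
catch (Exception ex)
{
  Log(ex);                   // reported, but the in-memory change survives
}

// Later, same DataContext: the query executes, but the identity map hands back
// the already-tracked, already-modified instance, so the box *looks* in stock
// even though the database row was never updated.
var again = context.Boxes.Single(b => b.Barcode == scannedCode);
// again.Status == BoxStatus.InStock -- a lie until Refresh() or a fresh context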

Case in point: I recently had to track down one of these issues in a simple barcode stock application. It basically consists of orders and boxes. As boxes of stock come in, they are scanned with a barcode reader, looked up in the DB using Linq2SQL, have their statuses validated, and updated. As each box is scanned in, the order is also loaded to inspect whether or not all of its boxes have been scanned into stock, and when they all are, the order is updated to "complete". The bug was that every so often (once every several weeks or more) we got a situation where an order was found to be complete while one of its boxes was not recorded in the DB as "in stock". We eliminated every possibility and scoured the code for status change bugs, but there was nothing. Scanning exceptions were already brought up to the screen, but operators typically dealt with several scan issues (re-scanning boxes after being interrupted, etc.) so they didn't notice anything odd. After extending the logging support to see what was going on, I found that the box was scanned, an exception occurred, but when re-scanned it said it was in stock, and after all remaining boxes on the order were scanned, the order found all boxes to be in stock and diligently closed itself off.

The issue was Linq2SQL's caching behaviour. The exception was occurring when the box's change was submitted to the database. Any problem with the connection to the server (This was a client running in Victoria hitting a DB in Queensland) and the SubmitChanges would fail, but the updated objects remained in the cache. Since the same data context was used for all boxes scanned in during a receive stock session, further requests for that box retrieved the cached copy.

This is a bad scenario to be in. Your options are to retry SubmitChanges until it goes through, to try to revert the changes until the revert succeeds, or to force the application to shut down if neither of those works. I was already reporting back to the user that there was an error, and IMO if there is a problem then the user should retry what they were doing (i.e. re-scan the box), so I wanted to revert the changes. If the data source still wasn't available, then the user should shut down the application and contact support.

The first solution that came to mind was to catch the exception from SubmitChanges and refresh the affected object graph. Since the box had child entities that were also updated, plus an order line item, this entire graph would need to be refreshed. Easy enough: as objects are queried and updated, they get added to a revertList for use should the SubmitChanges fail:


try
{
    DataContext.SubmitChanges();
}
catch
{ 
    DataContext.Refresh(RefreshMode.OverwriteCurrentValues, revertList);
    throw;
}


Unfortunately this doesn't work as the Refresh method will fail with:
System.InvalidOperationException: The operation cannot be performed during a call to SubmitChanges.
at System.Data.Linq.DataContext.CheckNotInSubmitChanges()
at System.Data.Linq.DataContext.Refresh(RefreshMode mode, IEnumerable entities) ...

So I needed to trigger the Refresh call outside of the SubmitChanges scope. The first thing to note is that while the system is processing, the scanner is turned off until the processing is completed. For minor errors such as scanning a box twice, etc. the scanner beeps out, displays a message, and reactivates. For serious application errors the scanner remains off until the application determines the exception can be corrected. The user knows that if the scanner beeps out and turns off, check the screen, and try restarting the application if it doesn't turn on within a minute or so.

In any case, the solution was to use an asynchronous operation to handle the scanner error and data refresh. This meant a worker thread, and locking off scanner access until the refresh was successful or timed itself out trying.


try
{
  DataContext.SubmitChanges();
}
catch
{ 
  revertDomainObjects(scanner, revertList);
  throw;
}

/// <summary>
/// This method will attempt to revert changes to objects to reflect the data
/// state. This will be called in situations where the data connectivity has 
/// been interrupted, where data state does not reflect cached state. This will
/// attempt to refresh cached copies of objects to match data state.
/// </summary>
/// <param name="revertList">List of domain objects to revert back to data state.</param>
private void revertDomainObjects(IScannerHandler scanner, List<object> revertList)
{
  ThreadPool.QueueUserWorkItem((e) => 
  {
    lock (_dataContext)
    { 
      var inscopeScanner = ((List<object>)e)[0] as IScannerHandler;
      var inscopeRevertlist = ((List<object>)e)[1] as List<object>;
      int attemptCounter = 0;
      bool isSuccessful = false;
      while (!isSuccessful)
      {
        try
        {
          _dataContext.Refresh(RefreshMode.OverwriteCurrentValues, inscopeRevertlist);
          isSuccessful = true;
          inscopeScanner.RecoveredFromCriticalException();
          inscopeScanner.Status = Objects.ScannerStatus.Idle;
          inscopeScanner.StatusMessage = "Scanner has recovered from the previous error. Please retry operation and contact Licensys I.T. if the issue continues.";
        }
        catch
        { // If the rollback fails, sleep for half a second and retry; give up (re-throw) after 5 failed attempts.
          if (++attemptCounter >= 5)
            throw;
          Thread.Sleep(500);
        }
      }
    }
  }, 
  new List<object>() {scanner, revertList});
}


Probably a better solution would be to use a Task<> implementation if it's available in .Net 3.5, but this does the job and I don't need to be able to interrupt it.
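For the record, Task<> doesn't come in the box until .Net 4, so it would have to wait for an upgrade, but the same fire-and-forget retry could be sketched roughly like this (same hypothetical scanner members as the version above):

// Requires .Net 4 (System.Threading.Tasks). Same retry semantics as the
// ThreadPool version; any final exception surfaces on the Task instead.
private void revertDomainObjects(IScannerHandler scanner, List<object> revertList)
{
  Task.Factory.StartNew(() =>
  {
    lock (_dataContext)
    {
      for (int attempt = 0; attempt < 5; attempt++)
      {
        try
        {
          _dataContext.Refresh(RefreshMode.OverwriteCurrentValues, revertList);
          scanner.RecoveredFromCriticalException();
          scanner.Status = Objects.ScannerStatus.Idle;
          scanner.StatusMessage = "Scanner has recovered from the previous error. Please retry operation.";
          return;
        }
        catch
        {
          if (attempt == 4)
            throw;            // give up after 5 attempts
          Thread.Sleep(500);  // brief pause before retrying
        }
      }
    }
  });
}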

Now in the case of an exception during SubmitChanges, the scanner will alert the user and be de-activated (triggered when the exception bubbles up), and the worker process will kick off to try and refresh the affected data. If that is successful, the scanner will be re-activated and the user notified that they can continue.

Caveats: This did require adding some thread-safety to an otherwise single-threaded application. I added an accessor property that attempted to lock the data context before returning it. Not ideal but in this case the means of triggering calls to the data context (scanner) was disabled until the operation was completed. This will be reviewed and refactored shortly as we move to support multiple simultaneous scanners.

Friday, July 30, 2010

Rolling in the mud.

Every idea starts its life as a shiny new marble. Unfortunately the world around us is a pretty muddy place, and when that idea gets pushed around in the real world, it soon starts collecting mud.

Software development has had countless shiny marbles since I started in the industry, and it's pretty interesting watching as they all slowly succumb to the mud. Development languages, best practices, design patterns, frameworks, etc. Each virtually glows with merits and virtues, flinging mud on its predecessors. Everyone wants a turn at pushing around the new shiny marble, but pretty soon it too is coated in mud.

A recent example: MVVM (Model View View-Model). This is a relatively new design pattern stemming from the MVC (Model View Controller) pattern ancestry. Essentially you have a history that starts at MVC, which was a great pattern for earlier, relatively state-expensive terminal applications, and is very well suited to stateless web applications. However, for environments where maintaining state was cheap and users wanted GUI-rich applications, a new shiny variation was born: MVP (Model View Presenter). It's essentially the same thing, just with some responsibilities shifted between layers. A good explanation can be found here.

Microsoft chased MVP for quite a while with their WinForms and later, WebForms implementations. WebForms was a bit of a disaster because of the cost of maintaining state over the wire.

Martin Fowler coined a design pattern called Presentation Model, which was essentially an MVP pattern where the Presenter was converted into a representation of the model for the view. A subtle variation of this was later given the title of MVVM, and the next shiny marble began rolling around. Microsoft re-defined the way they represent views in both desktop and web applications to utilize XAML, pushing aside the older WinForm and WebForm constructs in favour of WPF and Silverlight. These were designed to gel with this new MVVM pattern.

The part I find odd is the seemingly outright vilification of the code-behind aspect still present in XAML, similar to the construct used in WinForms and WebForms. Back in the day these were shiny features; now they're shunned, to be replaced at all costs with new shiny features.

But it's kind of funny, because in flinging the mud on code-behind, it seems that MVVM is starting to roll deeper into the mud itself. MVP would rely somewhat on code-behind. In a clean example, the code-behind would essentially implement events from the view definition (interface) and, upon receiving an event from a UI element such as a button, raise an appropriate view event to the presenter. You might also have some simple method declarations that the presenter can call into the view to perform actions (such as clearing out all fields, etc.). The view is essentially anemic. But there is code-behind... POO-POO!

In MVVM-land, your view is bound to a View-Model, and your View-Model can define Commands that can be bound to things like buttons, so that clicking a button will automatically execute a command. Ok, I can see the merit in this, but wait, it's not that simple, it's got to be shinier. The Command accepts delegates so you can define what the command does... back in the View-Model... um, ok... This is starting to smell. Great, so we can put command logic in the View-Model, delegated through another object, just so we can avoid declaring an event in code-behind.
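To make the complaint concrete, the plumbing boils down to something like this (a generic sketch of my own using WPF's ICommand and CommandManager from System.Windows.Input, not any particular MVVM framework's implementation):

// A bare-bones delegate command: the view binds to it, the View-Model supplies
// the behaviour as delegates.
public class DelegateCommand : ICommand
{
  private readonly Action<object> _execute;
  private readonly Predicate<object> _canExecute;

  public DelegateCommand(Action<object> execute) : this(execute, null) { }

  public DelegateCommand(Action<object> execute, Predicate<object> canExecute)
  {
    _execute = execute;
    _canExecute = canExecute;
  }

  public bool CanExecute(object parameter)
  {
    return _canExecute == null || _canExecute(parameter);
  }

  public void Execute(object parameter)
  {
    _execute(parameter);
  }

  // WPF re-queries CanExecute whenever CommandManager suspects the UI state changed.
  public event EventHandler CanExecuteChanged
  {
    add { CommandManager.RequerySuggested += value; }
    remove { CommandManager.RequerySuggested -= value; }
  }
}

// In the View-Model:  SaveCommand = new DelegateCommand(o => Save());
// In the view:        <Button Content="Save" Command="{Binding SaveCommand}" />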

Granted, there is some merit in this approach. This does mean that a "designer" can tweak behaviour in an application merely by editing XAML; they don't need to see C#/VB code-behind files. But they still need to know what these commands are so they can wire them up in a view. In the end, the code is more complicated, with Func<> declarations and lambdas scattered around, all to remove a couple of simple events.
Now I find this whole argument of defining more stuff in XAML a pretty weak one. How many companies actually advertise for and hire "designers"? More importantly, is it such a huge ask to have a developer work with one of these designers to wire up UI changes, in the event that a designer (who likely comes from a web background and has to be able to breathe Javascript) couldn't figure out how to wire up an event in C#?

MVVM is starting to get ripe with issues that erode those wonderful savings it offered over MVP and code-behind. Setting focus after validation and dealing with message boxes are just a couple that are popping up. It's not that these issues don't have solutions, but the solutions are typically either hacks around limitations, or fairly complex answers to what should be simple problems that have been done to death in earlier patterns.

MVVM is getting muddy. I wonder when the next shiny marble will appear.

Friday, June 4, 2010

SQL Count by criteria.

This one had me scratching my head for a while. I had a tiered query grouped by a number of fields in the table structure. I wanted to get a count of all applicable items matching a set of given statuses.

For example, a report on Toys:

Type / Manufacturer    # Sold   # Returned
Cars
--Vrroom                   14            0
--Bigga                    16            2
Dolls
--Munchkin                 22            4
--Slimer                    7            1

etc.

Originally I had a query along the lines of:
SELECT i.InventoryType,
p.ProductName,
m.ManufacturerName,
CASE t.TransactionType
WHEN 1
THEN 'Sold'
WHEN 2
THEN 'Returned'
ELSE 'N/A'
END AS TransactionStatus,
COUNT(t.TransactionID) AS TransactionCount
FROM Inventory i
INNER JOIN Product p ON i.InventoryID = p.InventoryID
INNER JOIN Manufacturer m ON p.ManufacturerID = m.ManufacturerID
INNER JOIN Transaction t ON p.ProductID = t.ProductID
WHERE .... -- Conditions...
GROUP BY i.InventoryType, p.ProductName, m.ManufacturerName, t.TransactionType

(Now this is only a rough approximation of the issue I was facing)
I then grouped within the report and used a conditional field on the transaction status to count applicable items. It sort of worked, but produced 2 lines for the items that had both sales and returns:

Cars
--Vrroom 14 0
--Bigga 16 0
-- 0 2
Dolls
--Munchkin 22 0
-- 0 4
--Slimer 7 0
-- 0 1

At first this looked like an issue within the SSRS report, where it was grouping by Details. I removed that grouping, which at first looked like it worked (1 row per product/manufacturer), but that just "knocked off" one of the values. (I.e. for "Bigga" I'd end up with either 16/0 or 0/2, essentially depending on which status was found first.)

Round 2: I went back to the query determined to do the grouping properly in SQL.

...
m.ManufacturerName,
CASE t.TransactionType
WHEN 1
THEN COUNT(t.TransactionID)
ELSE
0
END AS SoldCount,
CASE t.TransactionType
WHEN 2
THEN COUNT(t.TransactionID)
ELSE
0
END AS ReturnedCount
...

This required t.TransactionType to be in the GROUP BY clause, which resulted in query results pretty much identical to what I was seeing in the report.

Round 3:
...
m.ManufacturerName,
SUM(CASE t.TransactionType
WHEN 1
THEN 1
ELSE 0
END) AS SoldCount,
SUM(CASE t.TransactionType
WHEN 2
THEN 1
ELSE 0
END) AS ReturnedCount
...

And Bingo!

Cars
--Vrroom 14 0
--Bigga 16 2

Dolls
--Munchkin 22 4
--Slimer 7 1

(Also, if a transaction had a quantity you wanted to count by, then you'd sum t.Quantity rather than 1, the per-transaction count.)
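For reference, the final query ends up shaped roughly like this (still only the rough approximation of the real schema used above):

SELECT i.InventoryType,
p.ProductName,
m.ManufacturerName,
SUM(CASE t.TransactionType WHEN 1 THEN 1 ELSE 0 END) AS SoldCount,
SUM(CASE t.TransactionType WHEN 2 THEN 1 ELSE 0 END) AS ReturnedCount
FROM Inventory i
INNER JOIN Product p ON i.InventoryID = p.InventoryID
INNER JOIN Manufacturer m ON p.ManufacturerID = m.ManufacturerID
INNER JOIN Transaction t ON p.ProductID = t.ProductID
WHERE ... -- Conditions...
GROUP BY i.InventoryType, p.ProductName, m.ManufacturerName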

Thou shall not transmit localized DateTime values.

Date and time formats come in a variety of flavours, but one combination causes countless bugs, crashes, and potentially numerous lawsuits: dd/mm/yyyy vs. mm/dd/yyyy. If you ever come across "It worked yesterday, but it's crashing today" pretty much anywhere other than North America you've met this little gem.

The important thing to remember about DateTime values is that they are stored as numbers, not as formatted text. 06/01/2010 is NOT a date; it is a localization of a date. The computer you give that localization to has to interpret what the actual date is. Is it January 6th, or June 1st? ANY and EVERY time you need to transmit or transfer a DateTime from one source to another, you MUST take this into account. Generally this means every time the DateTime travels from one machine to another, such as to a database server, a reporting server, or a web service. Today that may be two machines with the same regional settings, but it only takes one server or client PC set to a different regional date format to totally hose your system. One thing you should NEVER see in code is a ".ToString()" (empty parameters) applied to a DateTime variable.

Now you don't need to go and start passing DateTime values around as raw numbers to avoid this issue (though that is one option). The other practical option when transmitting DateTime values is to use ISO formatting. International Standard formatting for DateTime values is yyyy-MM-dd HH:mm:ss, that is, 2010-01-06 19:04:30. In .Net this is commonly known as the "Sortable" format. The main reason this is advantageous is that, regardless of local date formats, any application worth its salt *will* accept an ISO DateTime with its 24-hour time format. ISO DateTimes also have the advantage of being fully sortable, so you can sort down to the month, day, hour, etc., or generate sequential numbers for things like filenames.
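In .Net terms that boils down to something like this (a small sketch; needs System.Globalization):

// Formatting for transmission -- culture-independent and sortable.
DateTime stamp = new DateTime(2010, 1, 6, 19, 4, 30);
string wireValue = stamp.ToString("yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture);
// "2010-01-06 19:04:30" -- unambiguous on any machine, any regional settings.

// Parsing on the other side -- same explicit format, same invariant culture.
DateTime roundTripped = DateTime.ParseExact(wireValue, "yyyy-MM-dd HH:mm:ss", CultureInfo.InvariantCulture);

// What NOT to do: stamp.ToString() with no arguments -- the result depends entirely
// on the regional settings of whichever machine happens to run it.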

So there's no need to abandon the DateTime field type, whether in .Net, SSRS, Crystal Reports, SQL Server, or Oracle, and no need to start farting around with regional settings; just adopt ISO date and time formats when passing dates around.

Saturday, May 1, 2010

Linq2SQL is *NOT* an ORM

I think this needs to be stressed as much as possible. Linq2SQL, and other deviations that generate an object model based on a relational database are NOT ORMs.

ORM: Object-Relational Mapper

"Mapper" is the key missing ingredient with solutions like Linq2SQL. The key driving concept of an ORM is to allow you to develop an object model in a manner that meets the business needs of an application, or suite of applications, completely independently of the data source or sources serving that business. This doesn't mean that a product tied to a particular RDBMS has to be any less of an ORM than one that can serve a number of RDBMS; that isn't the point of the mapping. The point of the mapping element is that the views in your application can be served by domain objects designed specifically for the purpose of serving those views, irregardless of the data structure behind the data of that view. If it helps, think of earlier implementations of Views within an RDBMS, or stored procedures that were designed to flatten and translate highly relational data into a data form designed to serve a specific purpose. An ORM takes over for that role.

Linq2SQL in no way does any of this. A more accurate acronym for Linq2SQL and the like would be ROG, or Relational Object Generator. Linq2SQL generates objects in accordance with the relational model within the database for the application to consume. Not that this approach doesn't have its merits for certain situations, but it is a completely different kettle of fish.

Any "ORM" that purportedly generates an object model from a database is flat-out not an ORM. There is no mapping when the relationships are 1-1 between objects and data tables; It's a generator.

Friday, April 9, 2010

When configuration goes pear-shaped.

This one was a rather amusing recent development in the system I am currently assigned to maintain.

The system is highly configurable. Several of the driving queries used within the application are themselves stored as SQL within the database. Even the menu structure is built based on a query stored in the DB. The premise behind this was that if we need to tweak how these queries pull data we can do so without re-deploying the application. (A Silverlight + Web Services business app.) Wonderful!

So an issue comes up where some data is coming back in the wrong sort order. It turns out to be an issue between NULL and empty string values. A relatively simple change to the stored SQL statement should be all that's necessary, so I test and script the change, deploy it into the test environment, then close and re-open my SL client to be sure and....
The old query results are still coming back.

Those beautiful dynamic SQL statements are cached in the application pool, which means after changing a script you have to recycle the app pool on the server. *sigh*.

The drive for configuration is mostly valid because we want to avoid making changes to the Silverlight client as much as possible. (A whopping 3.5MB download whenever this changes [recently shrunk to 2.7MB], which with several hundred clients and more on the way will eat dearly into our ISP upload thresholds.) But queries like this are executed server-side anyway, so it's easy enough to update server-side code without re-building the SL client.

Regardless, it's not that bad a scenario, but it just goes to show that even the best-laid plans still develop sizeable potholes.

Wednesday, March 24, 2010

Using consistency's name in vain.

This is a bit of a pet peeve, and something I've heard repeatedly to justify bad code and/or bad design: "It needs to stay consistent."

Question: How can code or design improve if it must remain consistent?

Code and design should be free to evolve within the lifespan of a project, to freely address concerns that come up with the initial approach and constantly challenge whether or not the original assumptions were really the best decision for the business. Going back through all existing functionality and re-factoring based on new technologies, patterns, or preferences of the customer can be prohibitively expensive, but that should not be an excuse not to adopt it going forward... <caveat> *If* it is what the customers wants. </caveat>

A consistent approach to an application is important, but it's a bad idea to take it so far as to make it extremely rigid. A common pitfall is with UI. For instance, stressing a design that says all forms will have a default set of action buttons for wizard-like behaviour (Next, Prev, Cancel, Skip) is just asking for trouble when the customer wants to do something different. Pretty soon the "Cancel" button gets re-labelled and, on screen X, performs action Y. Designing all forms to be dynamic, based on some construct of a framework, leading to slow, buggy, and not-so-user-friendly experiences, is another example.

It's not that designs like this are inherently bad. They aren't, but they aren't guaranteed to be suitable to 100% of cases out there.

Saturday, March 13, 2010

Explicit Interfaces

The explicit use of interfaces is something I really try to follow and encourage others to start using.

Implement Interface Explicitly
The second option for implementing interfaces implements the interface member effectively as a guarded method, accessible solely through an interface reference rather than as a public member of the class. I don't think I've come across a development team that uses this option, but after experimenting with it, I think it's something that should be adopted a bit more by default.
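A quick illustration of the difference (a trivial example of my own, not from a real project):

public interface ICustomerRepository
{
  void Save(Customer customer);
}

// Implicit implementation: Save is part of the class's public API and callable
// from any CustomerRepository reference.
public class CustomerRepository : ICustomerRepository
{
  public void Save(Customer customer) { /* ... */ }
}

// Explicit implementation: Save is only reachable through an ICustomerRepository
// reference, nudging consumers toward the contract rather than the concrete type.
// Remove Save from the interface and this class fails to compile, immediately
// exposing the dead code.
public class GuardedCustomerRepository : ICustomerRepository
{
  void ICustomerRepository.Save(Customer customer) { /* ... */ }
}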

My goal when writing code is to define contracts for the interaction between distinct units of work. This gives me the flexibility to swap out classes easily and quickly, which means I'm predominantly working through interfaces. The most significant benefit of implementing interfaces explicitly is that any change to the interface is immediately picked up by the compiler in the implementing member(s). If I re-factor code and remove methods from an interface, the next build flags the now-orphaned implementations, so I know straight away that I have dead code to clean up.

By designing concrete classes with explicit interfaces, the public API for those classes is kept tidy, and it helps enforce usage through the contract interfaces. I hate seeing code "twisted" by having concrete references constructed and passed around; it defeats the point of defining an interface in the first place.

Friday, March 12, 2010

Regex is dead! Long live the Lambda!

Lambda expressions are cool. Linq is cool. But Lambdas and Linq can be a quasi-pain-in-the-ass when it comes to inspecting and debugging the process flow of your application.

Remember regular expressions?

"Some people, when confronted with a problem, think 'I know, I'll use regular expressions.' Now they have two problems."

A famous quote from Jamie Zawinski. Regexes are cool. You can do a lot with them, but the trouble is that you can do a lot that you really, really shouldn't with them.

Linq and Lambdas fall exactly into this category, and I fear that after a few years of misuse and abuse, people will be shivering at the sight of a Lambda the same way they slink away from regular expressions. I've already seen several examples of Lambda expressions that make my skin crawl.

One common Lambda misuse is the same damn misuse you see with looping structures. "x", cursed "x". And when "x" is used up, you start to see "y", "t", "i", and other letters popping around inside complex nested lambda expressions. I even started doing it myself and then I decided that I deserved a punch to the head if I continued this nonsense.
"x" kills readability.

Lambda is also tough to debug, and unfortunately even with VS2010 the debug windows (such as Watch) still cannot compile them, and they probably never will. The issue with Lambda as I see it is that it is a very powerful and useful tool, but just as prone to misuse as regular expressions. A word of prophetic warning to developers out there: "Just because you can do something with Lambda, doesn't mean you should do it with Lambda."

"I've got string input... After it's been massaged, squeezed, filtered, and transformed by my uber Regular Expression I... I'm not getting the result I'm expecting... WTF?"

"I've got data... After it's been massaged, squeezed, filtered, and transformed by my uber Lambda expressions, I... I'm not getting back the results I'm expecting... WTF?"

I can't tell the difference, can you?

Let me be the first to coin the phrase:
"Some people, when confronted with a problem, think 'I know, I'll use Lambda expressions and LINQ.' Now they have two problems."

Monday, March 1, 2010

Divorcing the Framework

If the goal is to reuse code between projects, and adopt a consistent approach to building applications, it is possible to accomplish this by convention rather than spending a lot of time trying to invent a framework.

As a starting point I would highly recommend reading "Clean Code" by Robert C. Martin (Uncle Bob) as a freshening view of how to write better code. My own realization was to compare software development with writing or painting. Many of the best writers out there still start their works with pen (or pencil) and paper, not on the computer. The reason for this is because they are applying a time-honoured technique of drafting down their ideas before reviewing and refining them. Painters don't just crack out the oils and produce a finished portrait. They draft in lines, paint over what they aren't satisfied with, and gradually build a portrait through refinement.

This is where software fails. Too often, a developer will write a block of code to satisfy a requirement, verify that it "works", and then move on to the next requirement. There is little to no refinement involved, either from performance or stability (engineering) or how well the intent behind the requirement is satisfied. Only later, when the product is put through its paces, do issues around performance or behaviour start coming to the surface. But by then the code is draft built upon draft, built upon draft.
Frameworks only compound this failure by trying to restrict options in how to accomplish a task, and significantly increase the investment required to refine any specific scenario. The failure to draft ideas and then re-factor not only the code, but the design & requirement is the paramount failing of a project.

The second step to building better projects is to gain an appreciation and understanding for Object-Oriented software development, S.O.L.I.D. principles, and design patterns. Technologies change all of the time, but the fundamentals don't. Learn to write self contained units of work that have clear boundaries. Develop automated unit tests around these units so that these boundaries are monitored. Getting to this point is 90% of the battle. Each unit of work can now be utilized in any number of projects, allowing a consistent approach to a specific domain problem. If it doesn't suit one given scenario it can be substituted or extended without corrupting the original. There is no framework, no configuration, only simple components representing units of distinct work. Tying these together is basic, fundamental OOP inheritance and composition via interfaces. They should not care how they are persisted, how they communicate with one another, or even what concrete implementation they are communicating with. Too often, these details (persistence, and communication) become the driving force behind building a framework. They start introducing limitations on the desired behaviour of the application, and the instant this happens, the framework is working against you. Clients don't pay developers to tell them what they cannot have, or to come up with ways to make what they want more expensive.

Most frameworks that I've worked on have been designed with a lot of good coding principles and patterns in mind, but also with a significant amount of investment in the intangible. Teams have invested many hours into "framework code" that does not directly meet functional requirements. This forms a type of debt that they expect to get back once the framework is mature enough and features can be quickly bolted in and configured. Unfortunately the vision at the start is incomplete and a lot of extra time is needed to adjust or work around limitations from earlier assumptions in the framework.

The benefit of breaking things down into units of work is that it opens the door to update popular components to new technologies while still allowing them to interoperate with older components. There are challenges as components are upgraded, but these are only in the ties between components. By following good patterns and practices these challenges are easily dealt with. A good example is when dealing with collections of objects. In .Net 1.1 the common approach was using ArrayLists while .Net 2.0 and onward adopted IList<> generics. If we re-factor units of work to take advantage of technologies such as Linq but still want to utilize these components in existing products, we need to manage the translation between ArrayLists and Lists via adapters.
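As a rough sketch of the kind of adapter I mean (Order is just a placeholder type):

// A legacy component exposes its results as an ArrayList; newer units of work
// want IList<Order>. A small adapter keeps the two talking without changing either side.
// Requires System.Linq and System.Collections.
public static class OrderListAdapter
{
  public static IList<Order> ToGenericList(ArrayList legacyOrders)
  {
    return legacyOrders.Cast<Order>().ToList();
  }

  public static ArrayList ToArrayList(IList<Order> orders)
  {
    return new ArrayList(orders.ToArray());
  }
}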

The overall goal is to keep things as simple to implement as possible with as little excess up-front investment as possible. By breaking up applications into shareable units of work, implemented in simple and consistent ways, developers can accomplish their task while remaining flexible to future change. This is something a framework will never give them without first taking its pound of flesh.

Sunday, February 28, 2010

Descriptive Code

I'm not a big fan of the term "self-documenting code" mainly because it seems to light the fires under both extremes of whether it's nobler to comment everything, or wiser to insist all code expresses what it does perfectly well. As with everything, moderation is the key.

I've participated in project teams that believe that every method in every class should be commented, and in others where I was told in a code review to remove all comments. At the time I was a bit confused why I would have been told to remove comments, but over time it did begin to make sense.

The objective of self-documenting code is that the code should describe its intent without the need for additional external documentation.

My original attitude about commenting code was that while anyone reading code should be able to understand "what" the code does, there should be comments to reflect "why" the code does what it does. However, the trouble with commenting is that as you re-factor code, the intent of the code can change from what was originally commented. "Comments Lie."

On the other end of the spectrum, commenting everything is a complete waste of time and just leads to copy & paste errors, and more lies. Why do you need a comment like this on something like a Customer Presenter?

/// <summary>
/// Saves the provided Customer.
/// </summary>
/// <param name="customer">Customer to save.</param>
public void Save(Customer customer)
{
//...
}

Ok, if your intent is to be able to produce documentation about an API, by all means comment public interface methods and comment them well; but for general code? This is just a waste of time. Most of the time when I see this in practice there are plenty of examples where the comment headers aren't accurate anyway; referencing parameters that were removed, or missing parameters that were added.

I still stand beside my original belief. Code by itself is misleading; it only tells half of the story. I can see what it does, but not why it does it. The best advice I can give regarding self-documenting code:

1. Code should be organized into distinct units of work. It should do one thing, and one thing only, even if that is just to coordinate other distinct actions. The name of that method should perfectly describe what that code is supposed to do.
For example: private void updateOrderStatusToDispatched(Order order)
This method name perfectly describes what you intend its code to perform. When this method is called it is crystal clear what you are intending to do.

2. Reduce, or better yet eliminate, the use of conditional logic in your code. Nested ifs/else ifs or a switch statement are a good indication that you can break code down into smaller units of meaningful work. 90% of the "if" statements I use in code are for fail-fast assert behaviour: if the method should do nothing based on the parameters it's given, return now; if a parameter isn't valid, throw an exception. (A short sketch of this style follows below.)

Where you do have conditional bits for a reason, be sure to use a comment to describe "why" that condition exists. Too many times I've seen code break because an undocumented conditional term, added for one particular case, caused a bug in another case; the "fix" of removing that condition repaired the second case but broke the original one. You might remember why you added an "if" condition around a particular block after 1 month, or maybe 3 months, but how about 6 months? Two years? And what about the other poor souls working on the code base with you?
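As a sketch of the fail-fast style from point 2 (the Order, OrderStatus, and OrderService types here are hypothetical; only updateOrderStatusToDispatched comes from the earlier example):

    using System;

    public enum OrderStatus { Pending, Dispatched }

    public class Order
    {
        public OrderStatus Status { get; set; }
    }

    public class OrderService
    {
        public void DispatchOrder(Order order)
        {
            // Fail fast: an invalid parameter is a programming error, so throw now.
            if (order == null)
                throw new ArgumentNullException("order");

            // Nothing to do: the order is already dispatched, so return now.
            if (order.Status == OrderStatus.Dispatched)
                return;

            updateOrderStatusToDispatched(order);
        }

        private void updateOrderStatusToDispatched(Order order)
        {
            order.Status = OrderStatus.Dispatched;
        }
    }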

3. Write unit tests around the behaviour of the functionality, not specific code implementation. Something adopted from BDD (Behaviour-Driven-Design/Development) is that a unit test can represent the original functional requirement for the application. In this way, the code being tested is documented with the "why" it is implemented the way that it is.
For example:
[Test]
public void EnsureThatWhenACustomerConfirmsAnOrderTheOrderIsDispatched()

This test method would not test a method like updateOrderStatusToDispatched() directly; the update method is a private member. Instead it tests the behaviour of a customer confirming an order, and asserts that the expected outcome of that action is that the order is dispatched. (A fuller sketch of such a test follows below.)
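Something like this, reusing the hypothetical Order/OrderStatus/OrderService types from the earlier sketch and assuming the service exposes a public ConfirmOrder method (an assumption for illustration, not from the original post):

    using NUnit.Framework;

    [TestFixture]
    public class OrderBehaviourTests
    {
        [Test]
        public void EnsureThatWhenACustomerConfirmsAnOrderTheOrderIsDispatched()
        {
            var order = new Order { Status = OrderStatus.Pending };
            var service = new OrderService();

            // Exercise the public behaviour; the private update method is never called directly.
            service.ConfirmOrder(order);

            // The requirement: confirming an order results in the order being dispatched.
            Assert.AreEqual(OrderStatus.Dispatched, order.Status);
        }
    }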

Methods such as updateOrderStatusToDispatched() serve self-documentation by describing precisely what their intended behaviour is. By incorporating unit tests that represent the various requirements of our application, our code is now documenting itself in ways that are virtually impossible with separate requirements and design documents. If I change the behaviour of the code unexpectedly, my unit tests break, telling me I changed behaviour. If I need to change or add behaviour, the changes to my unit tests assert those changes as I re-design the code implementation. Keeping a separate "Word" document up to date for a software design is frustrating by comparison.

Wednesday, February 24, 2010

It's official. I HATE Reporting Services.

I started working with Reporting Services about 2 months ago. Coming from a Crystal Reports background, and all of the trials and tribulations that go along with that, I was skeptical but optimistic about working with an alternative.

I put Microsoft's earlier attempt to provide a reporting solution other than Crystal with Visual Studio behind me (which ended up little more than a glorified Print Form) and dived in.

At first I was liking what I saw. Shared data sources were a little bit twitchy, but overall they worked well, and deployment to the report server was a breeze, either through Visual Studio or the web interface. Bringing up a report in a pop-up window was a snap. It seemed a bit "wrong" that users would need to install something in order to print those reports; I don't see why HTML rendering wasn't provided. But all in all, getting some simple reports out to local users was quite pleasant. Then the trolls started lurking out from under their bridge.

The first point of pain was when I had to generate a rather complex report similar to a cross-tab. Getting the data into the report wasn't a problem, but keeping the report confined to a single page proved to be a challenge. The issue was due to some form of moronic layout engine Microsoft adopted for the report designer. I had a set of text and calculated fields up at the top of the report (my report header) and a tablix control well below them. The tablix control would dynamically create columns based on the report data, so it was sized quite small. For some reason the controls positioned to the right of the tablix would shift further right so that they were always rendered after the end of the tablix control. I could understand this if the controls somehow overlapped, but these were up at the top of the report. The result was header controls being pushed off onto a second page.

No issue, I should be able to just create a section for the report header to stop confusing the layout with the tablix control.... There is no concept of sections in the designer.
My only option was to place these controls in the page header, which in this case is not a major issue but it doesn't give me an option to use a report header.

The real kick in the teeth with Reporting Services is the draconian lock-down that makes it virtually impossible to expose a report in an extranet capacity. I can understand arguments for not wanting to publish reports on the Internet, and by all means, educate developers to avoid such configurations, or show them how to do it securely, but to just lock the option out completely? Come on! I have an external site that *isn't* part of our network that needs to get access to the reports. No, we don't use VPN tunnelling or want to set them up as part of our network in any capacity. We just want them to access a website (which is locked down by IP filtering) and be able to click on a link to bring up a report. But it's effectively impossible with Reporting Services 2008.

IMO, a reporting service should just GENERATE REPORTS. Let me take care of who has access and how they access it; Reporting Services, just produce the bleeping report.

Saturday, February 20, 2010

Frameworks are like marrying your Sister

I came across a rather interesting article here giving an opinion on ORMs & DBMSs, which is a bit of a push away from the old mentality and towards considering new options when it comes to persisting data.

If using a DBMS is like kissing your sister, designing & building "frameworks" should be like marrying her. It might be a bit too strong (as one should never marry one's sister) but 99.5% of the time development teams come up with the idea that "we should write a framework" they really, really should think again. I'm of course talking about development teams whose job is to write "application" software, *not* development teams whose primary goal is to write a framework. (A la DNN, etc.)

Frameworks are a good thing when your goal is to apply a consistent approach to a problem domain. When you invest in a framework, the reward is to utilize it in as many ways, and as many times, as it takes to make the investment worthwhile. It isn't worthwhile to create a framework that is used once, twice, five times, or even ten times. If the framework can be used in 100 different projects or more, *then* it is worthwhile. Time is the key factor in why application development teams should *never* waste time trying to write frameworks. How long will it be before that framework is used in even 5 projects, let alone 100?

Time is every framework's enemy. Technologies are constantly changing and it takes quite an effort to keep any piece of software up to date to take advantage of the current mainstream technology. If your development team will be expected to produce 10 applications (or distinct major releases of applications) over the next five years, can you afford or justify keeping all previous versions of the applications updated while you're also supposed to be implementing new functionality on new products? If you let projects rot on the vine with previous builds of the framework, you must realize that clients are going to expect to modify and enhance those applications for years to come.

There are a number of arguments I've heard in favor of building frameworks, such as "code re-use". You do not need a framework to re-use code. Libraries and components are code re-use. "You can re-configure a framework or plug into it to introduce future functionality." Sure, but that requires investing today in what you may not, and likely will not, need tomorrow. And how much extra effort is going to be spent when it turns out that what you guessed yesterday was wrong, and you need to re-work or work around the framework to implement today's functionality?

Sometimes, it's just that development teams have stars in their eyes; a dream that they can write the ultimate system today so that tomorrow, and for years to come, they can sit back smugly as new functional requests come in, just tweaking a couple of settings and writing a simple module. (They dream of writing themselves out of a job. :) But there's the kicker, "years". It will take years, many years, before any framework constructed by an application team returns its initial investment of tens to hundreds of thousands of lines of code, specific technologies and back-ends all perfectly aligned to meet today's vision. Ask any developer today whether they want to be working with .Net 1.1. Five years from now, are they going to be "happy" maintaining a .Net 3.5 framework? Or were they planning on upgrading their framework, and every application built on it to date, to .Net 4, .Net 5, and .Net 6 over the next five to ten years?

A framework is like building a closet organizer. If you build it for only what you need today, it would likely be a waste of space, and expensive/messy to tear out and replace when your needs change. So you plan for the future and build it to be flexible, to accommodate every thinkable scenario... Until you realize you didn't think of a place for your ties, so you're left stapling them to the wall.

A framework built in .Net 1.1 such as DNN justifies its existence by enticing 1000 projects to use it. It evolves into .Net 2.0 and beyond on the merits and feedback from previous versions, hopefully to be used on a further 10,000 or more projects. When your goal today is to write a handful of applications, and you get the idea of writing that mondo-cool framework: think of your sister.

Monday, February 1, 2010

Unit Testing using IoC Principles and Mocking/Stubbing

Overview
This document is geared towards providing an outline to producing effective unit tests for functionality using inversion of control principles and mocking frameworks.

There are a few different, and sometimes conflicting, definitions of what a mock is versus what a stub is. Generally a mock performs validation about whether and how a mocked method is called, whereas a stub is merely a placeholder that returns a canned result to a message. For the sake of simplification, this document will use the term mock to represent both mocks and stubs. A mocking framework can easily accommodate both.

Inversion of Control and Testing
One of the key benefits of adopting Inversion of Control (IoC) is that it enables testing classes in isolation. Take, for example, a scenario where we have a class responsible for providing functionality around domain objects. There is an additional class that provides validation of business rules such as permissions, and another class that manages interactions with persistence.

We could write this scenario similar to below:
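(The original post showed this code as an image; the following is a reconstruction. The class and method names, LocationProvider, LocationValidator, DataContainer, CanDelete, CountObjects, and DeleteObject, come from this article, while the Location class, the CountObjects signature, and the connection-string handling are assumptions.)

    public class Location
    {
        public int Id { get; set; }
    }

    public class DataContainer
    {
        public DataContainer(string connectionString) { /* configure the connection */ }

        public int CountObjects(string entityName, Location location) { /* query the database */ return 0; }

        public void DeleteObject(Location location) { /* delete from the database */ }
    }

    public class LocationValidator
    {
        // The dependency is created internally, hard-wired to a real database.
        private readonly DataContainer _dataContainer = new DataContainer("Data Source=...");

        public bool CanDelete(Location location)
        {
            // Business rule: a location cannot be deleted while user locations reference it.
            return _dataContainer.CountObjects("UserLocations", location) == 0;
        }
    }

    public class LocationProvider
    {
        // More hard-wired dependencies; nothing here can be substituted for testing.
        private readonly DataContainer _dataContainer = new DataContainer("Data Source=...");
        private readonly LocationValidator _validator = new LocationValidator();

        public void Delete(Location location)
        {
            if (!_validator.CanDelete(location))
                return;

            _dataContainer.DeleteObject(location);
        }
    }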



It’s pretty straightforward, with a good separation of concerns between objects. But how do we test the LocationProvider or LocationValidator class? LocationProvider is dependent on LocationValidator and DataContainer. LocationValidator is dependent on DataContainer. Assuming that the data container has some method of being configured with a connection string, all we can hope to do is set up a test case that points DataContainer at a test database, with the test setup inserting records into that database to ensure the expected data is present.


Another important consideration from above is that any test we would write for LocationProvider would now be dependent on the behaviour of LocationValidator. This problem is compounded as the number and depth of dependencies increases. Our test case for LocationProvider would need to set up data in order to ensure that the Validator passes, or fails, as expected. What happens when the Validator rules (behaviour) change? Our LocationProvider unit tests, and any tests that are dependent on the LocationValidator or the LocationProvider, will also fail because they would need to be updated to set up data according to the new rules.

This scenario may seem extreme, but it is the primary cause behind the frustration with, and eventual abandonment of, unit testing. Tests become too complex to set up, and far too brittle to be written early in the development cycle. It may not always be as extreme as this, with data as the central point of contention, but developers experience and then fear situations where changes to one class start breaking unit tests all over the project. This creates “noise” in the test suite that takes time to understand and fix across all of the affected tests.

It certainly doesn’t need to be like this.

Using IoC to De-couple Dependencies
Inversion of control provides a mechanism to de-couple dependencies from one another, allowing each dependency to be tested independently. IoC can be implemented a number of ways, such as providing dependencies in the constructor, setting dependencies via properties, or passing necessary dependencies as parameters within the method calls themselves. You could define dependencies as concrete instances, but in the interest of testability the dependencies will be defined as interfaces so that they can be substituted. The following example changes the scenario to provide dependencies via the constructor.
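(Again reconstructing the missing code image, under the same assumptions as the earlier sketch: the dependencies become interfaces and are supplied through the constructors.)

    public interface IDataContainer
    {
        int CountObjects(string entityName, Location location);
        void DeleteObject(Location location);
    }

    public interface ILocationValidator
    {
        bool CanDelete(Location location);
    }

    public class LocationValidator : ILocationValidator
    {
        private readonly IDataContainer _dataContainer;

        public LocationValidator(IDataContainer dataContainer)
        {
            _dataContainer = dataContainer;
        }

        public bool CanDelete(Location location)
        {
            // Same business rule as before; only where the dependency comes from has changed.
            return _dataContainer.CountObjects("UserLocations", location) == 0;
        }
    }

    public class LocationProvider
    {
        private readonly ILocationValidator _validator;
        private readonly IDataContainer _dataContainer;

        public LocationProvider(ILocationValidator validator, IDataContainer dataContainer)
        {
            _validator = validator;
            _dataContainer = dataContainer;
        }

        public void Delete(Location location)
        {
            // Only proceed when the validator gives the go-ahead.
            if (!_validator.CanDelete(location))
                return;

            _dataContainer.DeleteObject(location);
        }
    }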
The code around the functionality looks pretty much the same as the original example. The key difference is that the instances of the dependencies are now provided when each class is constructed. Now we have something we can test.

Testing in Isolation with Mocks
Let’s first look at writing a test for our Validator class using Moq (2.6). *Note: for newer versions of Moq (3.0+) substitute the “Expect” method name with “Setup”.
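(The test walked through below was shown as an image in the original post; here is a sketch of what it plausibly looked like, written against the interfaces from the previous sketch and using the newer Setup syntax. The test name and the "UserLocations" argument are guesses.)

    using Moq;
    using NUnit.Framework;

    [TestFixture]
    public class LocationValidatorTests
    {
        [Test]
        public void EnsureALocationCannotBeDeletedWhenUserLocationsReferenceIt()
        {
            var location = new Mock<Location>();                // (1) stubbed location
            var dataContainer = new Mock<IDataContainer>();     // (2) stubbed data container

            // (3) Canned result: pretend a user location references this location.
            dataContainer.Setup(d => d.CountObjects("UserLocations", location.Object))
                         .Returns(1);

            // (4) Construct the validator with the stubbed dependency.
            var validator = new LocationValidator(dataContainer.Object);

            Assert.IsFalse(validator.CanDelete(location.Object),
                "A referenced location should not be deletable.");
        }
    }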
In this example, we’ve created a stub object for the location (1), and another for the DataContainer dependency (2). The data container is configured to expect a method call to “CountObjects” and is instructed to return a non-zero value (3). We create the validator, passing in our stubbed data container (4). The test then asserts that the validator returns “False” based on the results.


By definition, the data container substitute that we’ve created is a stub, not a mock. The reason is that this object doesn’t validate that it was actually called. This would be better served by a true mock instance, which can be done by modifying the test as follows:
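(A sketch of that modification, adding a Verify call to the same test so the data container substitute becomes a true mock:)

    [Test]
    public void EnsureALocationCannotBeDeletedWhenUserLocationsReferenceIt()
    {
        var location = new Mock<Location>();
        var dataContainer = new Mock<IDataContainer>();

        dataContainer.Setup(d => d.CountObjects("UserLocations", location.Object))
                     .Returns(1);

        var validator = new LocationValidator(dataContainer.Object);

        Assert.IsFalse(validator.CanDelete(location.Object));

        // The addition: verify that CanDelete actually consulted the data container.
        dataContainer.Verify(d => d.CountObjects("UserLocations", location.Object));
    }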
Converting this to a true mock by adding the verification is important because it ensures that the CanDelete method actually does call the data container. (The business logic says it should.) The mocking framework would automatically detect if a different method was called, or different values for parameters, but it doesn’t report back whether a call actually occurred unless we ask it.


A note regarding the naming convention for the test: this example unit test is named to reflect the behaviour, or business requirement, of the code being tested. This is in the spirit of Behaviour Driven Development (BDD). By testing around the behaviour, the unit test itself becomes part of the design for the given system. It defines a required behaviour, and as an integrated part of the project, anyone seeking to change the requirement (test) or the implemented behaviour (code) will have to ensure that the two reflect one another.

The next obvious test scenario is that CanDelete returns True if user locations are not found.
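(A sketch of that scenario under the same assumptions, added to the same fixture: the canned count is zero, so the validator should report that the delete can go ahead.)

    [Test]
    public void EnsureALocationCanBeDeletedWhenNoUserLocationsReferenceIt()
    {
        var location = new Mock<Location>();
        var dataContainer = new Mock<IDataContainer>();

        // No user locations reference this location.
        dataContainer.Setup(d => d.CountObjects("UserLocations", location.Object))
                     .Returns(0);

        var validator = new LocationValidator(dataContainer.Object);

        Assert.IsTrue(validator.CanDelete(location.Object),
            "An unreferenced location should be deletable.");
    }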

More on Mocking Frameworks

Mocks and stubs can be hand-rolled by creating test classes that implement the various interfaces. Mocking frameworks like Moq are invaluable because they hide the messy extras of implementing interfaces where you’d need to include signatures for every method in the interface. With mocking frameworks you only implement the interface methods you want to use. The IDataContainer may eventually have 20 or 30 methods in it but with the mocking framework your tests can safely be written around only the methods you care about. If you hand-roll a stub class, as you extend the interface you’ll need to continuously go back to the stubs to add place-holder methods to match the interface.

There are a number of mocking frameworks available for .Net including Rhino Mocks, Moq, NMock, and EasyMock.Net. Rhino Mocks is arguably one of the oldest and most feature-rich. Moq is a “back to basics” mocking framework written around .Net 3.0, offering an elegant, simple tool for defining stubs and mocks. My argument for going with the basic and elegant option is that unit testing should strive to be as flexible and non-intrusive as possible. If tests adopt complex mocking behaviour you end up spending a lot of time trying to build tests to assert what really should be basic concepts. If you find you need to mock out complex scenarios then you either need to take another look at how you’re structuring your dependencies, or you’re testing far too deeply into dependency layers.

Testing Layered Dependencies
The real power of using IoC coupled with mocking presents itself in scenarios where dependencies are layered. Let’s look at the case of testing the LocationProvider class. This class has a method to perform a delete that asks the validator, then instructs the data container to perform the action if the validator gives the go-ahead. We already have unit test coverage for the validator to ensure that it does the right thing with inputs and outputs, so as far as tests for the location provider go, the validator can be mocked.
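(The test described below was also an image in the original; here is a plausible reconstruction. The test name is a guess, modelled after the one quoted later in the Cowboy Coding section.)

    [TestFixture]
    public class LocationProviderTests
    {
        [Test]
        public void EnsureDeleteOccursWhenValidationRulesAreSatisfied()
        {
            var location = new Mock<Location>();                 // (1) stubbed location
            var validator = new Mock<ILocationValidator>();      // (2) mocked validator
            var dataContainer = new Mock<IDataContainer>();

            // (3) The validator gives the go-ahead for our stubbed location.
            validator.Setup(v => v.CanDelete(location.Object)).Returns(true);

            // (4) The data container expects to be told to delete that same location.
            dataContainer.Setup(d => d.DeleteObject(location.Object));

            // (5) Exercise the provider with the two mocked dependencies.
            var provider = new LocationProvider(validator.Object, dataContainer.Object);
            provider.Delete(location.Object);

            // Verify that both calls actually occurred.
            validator.Verify(v => v.CanDelete(location.Object));
            dataContainer.Verify(d => d.DeleteObject(location.Object));
        }
    }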
We create our stub object again (1) and a mock validator. (2) We instruct the validator to expect a call to CanDelete with our stubbed location, and to return True giving the go-ahead to delete. (3) We instruct the data container mock to expect a call to DeleteObject, provided with our stubbed location. (4) Note that these calls will fail if either call is made with anything other than the reference to our stubbed location. We create an instance of our provider class providing the two mock dependencies, and call the Delete method with the stubbed location. (5) The final step is to verify that our mock methods were actually called.


Notice that we don’t need to add any complexity around trying to ensure what the validator does, where the data container would be expecting count requests and such. That behaviour is already tested. All we care about is that a delete goes ahead when the validator is satisfied.

The next scenario to test is that a delete *doesn’t* occur if the validator reports back a no-go condition.
The data container won’t be called if the validator returns false, so the test obviously won’t pass if we try to verify that call; we removed that line and the verify. But wait, now this test doesn’t actually test anything other than that the call to the validator occurs. This is ineffective. What we really want to do is verify that the data container *doesn’t* get called in this scenario:
To accomplish this, we put the .Expect call back in. But instead of doing a verify against it, we instruct it to throw an exception. This will cause the test to fail should the data container be instructed to delete an object even if the validator said no-go.
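(A sketch of the final form of that test, added to the same fixture and again using the newer Setup/Throws syntax in place of Expect; the exception guarantees a failure if DeleteObject is ever reached.)

    [Test]
    public void EnsureDeleteDoesntOccurIfValidationRulesAreNotSatisfied()
    {
        var location = new Mock<Location>();
        var validator = new Mock<ILocationValidator>();
        var dataContainer = new Mock<IDataContainer>();

        // The validator reports a no-go condition.
        validator.Setup(v => v.CanDelete(location.Object)).Returns(false);

        // Instead of verifying the delete call, make it blow up if it ever happens.
        dataContainer.Setup(d => d.DeleteObject(It.IsAny<Location>()))
                     .Throws(new System.InvalidOperationException(
                         "DeleteObject should not be called when validation fails."));

        var provider = new LocationProvider(validator.Object, dataContainer.Object);
        provider.Delete(location.Object);

        // The validator should still have been consulted.
        validator.Verify(v => v.CanDelete(location.Object));
    }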

This type of enforcement with mocks is invaluable to help detect and prevent aggravating bugs from working their way into the system undetected until late in the development cycle, or worse, when the product makes it into production.

Let’s look at a simple example of how this kind of thing could happen.

Cowboy Coding
Meet Cowboy Cody. Cody’s a great coder, quick as a whip. Cody starts off running with his new enhancement: to add a condition to change some rules around deleting locations. He looks at the existing code and gets as far as the LocationProvider, spotting the method he wants to change. “What’s this validator thing? Aw, hell, that kind of stuff isn’t necessary, all I want to do is add this one little condition. I don’t want to go through the trouble of adding a parameter so it’s going to be as easy as adding a module-level variable. Haha, 5 minutes and I’m done!”

Let’s see what damage Cody’s done…
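(The screenshot of Cody’s change is missing from the original post; based on the description, it plausibly looked something like this. Only the _someOtherCondition name is quoted later in the article; the rest is a reconstruction.)

    public class LocationProvider
    {
        // Cody's 5-minute shortcut: a module-level flag instead of a parameter.
        // Note the default value of true.
        private bool _someOtherCondition = true;

        private readonly ILocationValidator _validator;
        private readonly IDataContainer _dataContainer;

        public LocationProvider(ILocationValidator validator, IDataContainer dataContainer)
        {
            _validator = validator;
            _dataContainer = dataContainer;
        }

        public void Delete(Location location)
        {
            // The bug: "||" lets the new flag override the validator entirely.
            // (Cody later admits it should have been an "&&".)
            if (_validator.CanDelete(location) || _someOtherCondition)
                _dataContainer.DeleteObject(location);
        }
    }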

At a glance it looks harmless enough. Cody was at least good enough to go into the application and try out some of the conditions around his new 5-minute flag. He’s checked his code in.


Prudent Pat, the team lead, sees that changes have been made to the project and he immediately runs the test suite before starting his own enhancements. Red Light! “Ensure Delete Doesn’t Occur If Validation Rules Are Not Satisfied” has failed, DeleteObject has been called!

It turned out that the default for _someOtherCondition was “true”, and Cody had introduced a behaviour-change bug where his added condition could override the other validation rules. Pat consults Cody, and Cody admits, “Whups, that should have been an ‘&&’.”

The next morning, Cody’s assigned the task of “fixing” his change properly, in the right place, and correctly ensuring the unit tests reflect the behaviour.

Conclusion
Hopefully this is a useful guide for how to effectively apply IoC principles in combination with a mocking framework to write unit tests. Writing unit tests can often be a bit of a daunting task but gradually you will find the quality of the tests you write goes up, as the time spent writing them and managing breaking scenarios goes down.

One Final Mention
Unit testing should not dramatically increase the amount of time you spend writing code. Every developer already factors in time to start up the project and try out their changes as they work. Adopting unit tests takes the majority of that time investment and uses it to create tests that are repeatable. The initial investment is larger than the time you might spend trying out a single change, but hopefully you aren’t in the habit of only trying things out when you think you’re “done”. The cost of running existing unit tests to ensure you haven’t accidentally broken anything is a lot less than manually running through existing functionality as you introduce new functionality.

Friday, January 29, 2010

Append-Only Data Models

In an append-only data model, each change to a data entity is represented as an insert against the relevant table. There are no updates or deletes recorded against the data. The domain model and any adopted ORM strategy need to reflect this behaviour difference. Soft-delete is a term used when records are marked inactive rather than permanently deleted from the database. This is an integral part of append-only models, but it can easily be employed separately from them.

Advantages of Append-Only
Append-only models offer several advantages to applications. The most significant is a built-in audit trail for reviewing all changes to records and a snapshot of a record at any point in time. This audit trail can be used for reporting, undo-chains, and review/approval behaviour.

Challenges of Append-Only
Append-only models automatically increase the amount of storage space needed to represent data historically. This raises additional challenges when dealing with ORMs: ensuring references correctly point to the latest version when applicable, and avoiding new associations with stale references.

Structuring Data for Append-Only
There are 3 conventional approaches for adopting an append-only data model. Each addresses the fact that for any given data entity there can be one or more records in the database to represent it over time.

Key + Version Number
The first option is to introduce a version number and form a concatenated key between the record ID and Version. New versions receive an incremented version number. An optional table can be introduced to help indicate the most current version number for a data entity rather than performing index scans against the table.
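As an illustration (the BillingRate entity here is hypothetical), the shape of a versioned record under this approach might look like:

    // Hypothetical entity: the primary key is the concatenation of Id + Version.
    public class BillingRate
    {
        public int Id { get; set; }         // Shared by every version of the entity
        public int Version { get; set; }    // Incremented for each new version
        public decimal Rate { get; set; }
        public bool IsDeleted { get; set; } // Soft-delete flag instead of a physical delete

        // "Updating" means appending a new row with the next version number.
        public BillingRate CreateRevision(decimal newRate)
        {
            return new BillingRate { Id = Id, Version = Version + 1, Rate = newRate };
        }
    }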

Advantages
  1. ID is preserved between versions. (Suitable for meaningful IDs)
  2. FKs by ID are preserved. (Though not unique by themselves so caution is needed when querying.)
  3. Ancestor records are “pure”. (Unmodified when new descendants are created.)
  4. Old records can be archived/deleted easily.

Disadvantages
  1. Concatenated PK.
  2. Not intuitive to map relationships that are versioned.

Timestamps can be added in place of version numbers, or complementing version numbers, to provide functionality to assess a record’s state at a particular point in time.

Ancestor Reference + Version Number
The second option is to use a version number in combination with a reference (FK) to the record’s ancestor. This approach uses a meaningless (surrogate) PK for the record, so each new version has a unique ID plus a reference to the previous version (the ancestor). The version number is tracked, as well as a reference to identify the most current version of the record.



At some point we may want to archive or discard older records. This can be a bit of a problem with the FK relationship between record and ancestor when the ancestor is removed from the production data.


Setting a revision’s ancestor reference to null is fine if the ancestor is deleted; however, when archiving, special care will be needed to re-attach the reference when the revision itself is moved to the archive.

Advantages
  1. Unique IDs suitable for FK relationship. (No concatenation needed.)
  2. Ancestor records are “pure”. (Unmodified when new descendants are created.)
  3. Maps easier to ORM solutions, though care is needed to ensure stale references are avoided.

Disadvantages
  1. External FK relationships must be updated to the new version.
  2. May require index scans like above solution to keep references fresh.
  3. Old records cannot be easily removed due to FK relationships.

Descendant Reference
The third option is to use a reference (FK) to the record’s descendant. This is similar to the second approach, except that instead of the current record holding a reference to its ancestor, each ancestor is modified to contain a reference to its descendant. The advantage of this model is that the version number is not required, as the current instance will always have a null descendant ID. This makes it simple to detect when a reference has been made stale.

It also facilitates deleting and archiving better than the ancestor model, since records can be deleted without invalidating FKs. Only the archived records, not the remaining production records, require a placeholder proxy for the descendant at the cut-off point.
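A small hypothetical sketch of how the descendant reference makes the current-version check trivial:

    // Hypothetical entity: the current version is the row with no descendant.
    public class BillingRate
    {
        public int Id { get; set; }
        public int? DescendantId { get; set; } // Null until a newer version supersedes this row
        public decimal Rate { get; set; }

        // A reference to this row is stale as soon as a descendant exists.
        public bool IsCurrent
        {
            get { return DescendantId == null; }
        }
    }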



Advantages
  1. Unique IDs suitable for FK relationship. (No concatenation needed.)
  2. Simpler detection for current instance and stale references.
  3. Old records can easily be removed or archived.

Disadvantages
  1. External FK relationships must be updated to the new version.
  2. Ancestor records are not “pure”. Ancestor records must be updated with their descendant reference.

Snapshots
A snapshot refers to a model where a copy of the data is taken and recorded before each and every data modification. This is not append-only by definition: it serves only to provide a picture of historical data, and is not geared towards making versioned data accessible to references. Snapshots can be recorded in the same table, or in separate history tables. Snapshots can also be configured to be performed automatically by the database, which can be beneficial in situations where data can be manipulated by more than one application or process.

What Version Do I Point To?
Regardless of the approach taken, it is important that the domain model for the application is designed with append-only behaviour in mind. The most significant challenge in adopting this model is that you need to be explicit when dealing with the relationships between data. Certain design and behaviour decisions are needed within the application. If a revision is made to an object that is referenced by other objects, should those references be automatically updated to the latest revision, or is it valid that they can remain referenced to the version that was current when the association was made?

For example, if we make a change to a Doctor entity to update contact details or other relevant information, it would probably be beneficial to ensure that any appointments associated with that doctor, past, present, or future, are updated to the current revision of the Doctor’s record. Alternatively, if we were to adjust a billing rate applicable for a service, we’d likely want to ensure that past, and possibly present, appointments refer to the previous version of the record, while new and future appointments reflect the updated rate. (Possibly prompting users to ask whether or not present and future appointments should be updated.)

This is an important consideration when designing domain and data models around this kind of behaviour. For instance, for a service provider such as an Interpreter in the system, we’d want to be sure that information such as contact details is kept separate from figures such as billing charges. Some data we may want to ensure is always pointing to the current revision, while other data may be more selective.

In the case of selective association, this is one scenario where option 1 outlined above becomes a clumsy choice. Any foreign key association where the revision can be selective requires both the ID and the version number. In the case of an appointment referencing service provider billing information, the appointment will likely have its own revision number, as well as the ID of the billing information and the version of the billing information.