Thursday, October 16, 2025

Shrinking JSON for Data Storage

Lately I have been working on a pet project to help visualize how an offset account impacts a mortgage. This has mainly been a foray into building apps in Blazor. The goal is to have a version of the app that can run fully offline using local storage, which I have used Blazored.LocalStorage to assist with. Behind the scenes it serializes objects to JSON for local storage, which works just fine.

However, this raised a concern: the data being stored would be entered over years' worth of transactions to compute the offset impact. Each record is reasonably small, but there will potentially be a fair number of them, and I'd like the option to include things like descriptive names. JSON adds its own storage overhead since objects and all of their values are represented as text. With databases in modern systems, space for storage and indexing is cheap. With local storage we are limited to around 5MB, so we need to be a bit more careful with our data. This turned into a rather interesting problem of finding a balance between ease of use and storage space.

The first thing I looked at was ensuring that I was only storing info that was absolutely needed. With a relational database a Transaction record might look something like:

TransactionId, TransactionDate, AccountId, Amount, Description

With this offline store, the data looks more like a document store where the transactions fall under an account document. We still want an Id for the transaction within the document, but we can get rid of the FK. As for the description, many rows will be null, but for the ones people do fill in, there is a good chance that many will repeat. For this I decided to normalize by adding a TransactionDescription table, replacing the Description string on each row with an integer TransactionDescriptionId. As users start typing we can look up common descriptions and offer to link to an existing one, or create a new row for new values.
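As a rough sketch of that normalization (the class and property names here are illustrative rather than the actual project code), the account document might carry a description lookup that transactions reference by id:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Illustrative only: an account document carrying a description lookup, so
    // transactions store an integer id instead of repeating the string per row.
    public class AccountDocument
    {
        public Dictionary<int, string> Descriptions { get; } = new();

        // Returns the id of an existing matching description, or adds a new one.
        public int GetOrAddDescriptionId(string description)
        {
            var match = Descriptions.FirstOrDefault(d =>
                string.Equals(d.Value, description, StringComparison.OrdinalIgnoreCase));
            if (!string.IsNullOrEmpty(match.Value)) return match.Key;

            int newId = Descriptions.Count == 0 ? 1 : Descriptions.Keys.Max() + 1;
            Descriptions.Add(newId, description);
            return newId;
        }
    }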

When it comes to dates I decided to treat these as raw values with an unmapped translation to a C# date, so dates are stored in a short-form ISO format of "yyyyMMdd".

For dollar amounts I elected to store these in cents, letting the DTO expose a decimal currency value.

So a typical record would start to look like:

{ "TransactionId":1, "TransactionDate": "20251012", "Amount": 1545, "DescriptionId": 32 }

... where if the transaction had no description:

{ "TransactionId":1, "TransactionDate": "20251012", "Amount": 1545, "DescriptionId": null }
91 characters.

This is already a bit more compact than if I serialized a default, denormalized C# object to JSON, though we can do better. For a start, when serializing to JSON there is an option for how null values are handled: included or excluded. For data contracts it makes sense to include them so consumers know the value exists on the destination object. For data storage, though, we can ignore them. In the second case with no description this becomes:

{ "TransactionId":1, "TransactionDate": "20251012", "Amount": 1545 }
68 characters.
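For what it's worth, this is just a serializer setting rather than anything hand-rolled. A minimal sketch with Newtonsoft.Json (which the storage code further down also uses; System.Text.Json has an equivalent option):

    using System;
    using Newtonsoft.Json;

    // Anonymous object standing in for the real DTO; DescriptionId is null here.
    var transaction = new { TransactionId = 1, TransactionDate = "20251012", Amount = 1545, DescriptionId = (int?)null };

    var settings = new JsonSerializerSettings { NullValueHandling = NullValueHandling.Ignore };
    Console.WriteLine(JsonConvert.SerializeObject(transaction, settings));
    // -> {"TransactionId":1,"TransactionDate":"20251012","Amount":1545}

    // System.Text.Json offers the same via:
    //   new JsonSerializerOptions { DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull }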

Next, we can use the [DataContract] and [DataMember] attributes. Rather than the default opt-out serialization behaviour, where we would need [JsonIgnore] to exclude properties we don't want serialized, with [DataContract] we opt in the properties to include using [DataMember], and we can also specify a shorter name for each value. For instance, with my simple transaction above, the DTO looks something like:

    [DataContract]
    public class Transaction : IDto
    {
        [DataMember(Name = "id")]
        public int TransactionId { get; init; }

        // Stored as the raw "yyyyMMdd" string; the mapped DateTime is not serialized.
        [DataMember(Name = "dt")]
        public string TransactionDateRaw { get; init; } = "";
        public DateTime TransactionDate => DateTime.ParseExact(TransactionDateRaw, "yyyyMMdd", CultureInfo.InvariantCulture);

        // Amount in cents.
        [DataMember(Name = "am")]
        public long Amount { get; init; }

        [DataMember(Name = "td")]
        public int? TransactionDescriptionId { get; init; }
    }


The JSON now looks like:

{ "id":1, "dt": "20251012", "am": 1545 }
40 characters.

We have gone from 91 characters to less than half at 40 characters. 

The next change I made was around the date. I want to provide daily figures for things like the balance while supporting multiple transactions on a given day. When it comes to loans with an offset account, the general recommendation is to use the interest-free period on a credit card for daily spending and pay the card in full each month. That reduces the number of individual transactions to record, but we'll still have things like bill payments and other transactions falling on the same day. The structure I ended up with pulls the date value up into an AccountDay class containing the starting and final balance along with a collection of transactions. This removes the repeated date from the transaction rows.

The offset account manages a SortedList<int, AccountDay> collection of days indexed by the date (yyyyMMdd format). The resulting JSON looks like:

{"na":"Test Account","dz":{"20251001":{"ib":5000,"tx":[],"ba":5000},"20251002":{"ib":5000,"tx":[{"id":1,"am":44400},{"id":2,"am":29153},{"id":3,"am":17166}],"ba":95719},"20251003":{"ib":95719,"tx":[{"id":1,"am":39177},{"id":2,"am":39521},{"id":3,"am":27103},{"id":4,"am":41982}],"ba":243502},"20251004":{"ib":243502,"tx":[{"id":1,"am":43598}],"ba":287100},"20251005":{"ib":287100,"tx":[],"ba":287100},"20251006":{"ib":287100,"tx":[{"id":1,"am":12655},{"id":2,"am":9537},{"id":3,"am":38402}],"ba":347694},"20251007":{"ib":347694,"tx":[{"id":1,"am":46109},{"id":2,"am":43684}],"ba":437487},"20251008":{"ib":437487,"tx":[{"id":1,"am":11688},{"id":2,"am":2481}],"ba":451656},"20251009":{"ib":451656,"tx":[{"id":1,"am":40970}],"ba":492626},"20251010":{"ib":492626,"tx":[{"id":1,"am":32517},{"id":2,"am":20656},{"id":3,"am":17146}],"ba":562945},"20251011":{"ib":562945,"tx":[{"id":1,"am":1350},{"id":2,"am":21341}],"ba":585636},"20251012":{"ib":585636,"tx":[],"ba":585636},"20251013":{"ib":585636,"tx":[],"ba":585636},"20251014":{"ib":585636,"tx":[{"id":1,"am":41260},{"id":2,"am":31053}],"ba":657949},"20251015":{"ib":657949,"tx":[{"id":1,"am":47606}],"ba":705555}}}
It isn't very readable by any stretch, but it is compact, and it deserializes back into the object model where all of the work and formatting is done.
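Reading the short keys back ("na" = name, "dz" = days, "ib" = initial balance, "tx" = transactions, "ba" = closing balance), a rough sketch of the containing DTOs might look like the following. The class and property names are my inference from the JSON above rather than the actual project code, and the date now lives on the day key rather than on each transaction:

    using System.Collections.Generic;
    using System.Runtime.Serialization;

    [DataContract]
    public class OffsetAccount : IDto
    {
        [DataMember(Name = "na")]
        public string Name { get; init; } = "";

        // Days keyed by the yyyyMMdd date value.
        [DataMember(Name = "dz")]
        public SortedList<int, AccountDay> Days { get; init; } = new();
    }

    [DataContract]
    public class AccountDay : IDto
    {
        [DataMember(Name = "ib")]
        public long InitialBalance { get; init; }

        // Transactions here carry only the id and amount; the date lives on the day key.
        [DataMember(Name = "tx")]
        public List<Transaction> Transactions { get; init; } = new();

        [DataMember(Name = "ba")]
        public long Balance { get; init; }
    }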

My thinking is that I'll load and save account data in parcels by month or by year, depending on the final data size, to help ensure the tool remains device friendly. For instance, if data is automatically parcelled and serialized by "yyyyMM", then the "days" key shrinks from "yyyyMMdd" to just "dd", saving even more space.

I could have left it at this and let Blazored.LocalStorage serialize this structure, and I'd likely have enough local storage for decades of data for a typical user, but Blazored.LocalStorage can also store raw string values, which got me thinking: what if I serialized and compressed the data? Compression should significantly shrink the JSON, but then I'd need to Base64-encode the result, adding about a third to the compressed size. To see what a typical use case might look like, I generated around 15 years' worth of transactions. The test generator built anywhere from roughly 35,000 to 50,000 records. With the original JSON objects this amounted to around 1.3MB. The shortened JSON properties dropped that to around 545kB. The last step was to see what compression might do.

    public static string? Serialize(this IDto dto)
    {
        try
        {
            string json = JsonConvert.SerializeObject(dto,
                new JsonSerializerSettings { NullValueHandling = NullValueHandling.Ignore });
            var buffer = Encoding.UTF8.GetBytes(json);
            using var fromStream = new MemoryStream(buffer);
            using var toStream = new MemoryStream();

            // The DeflateStream must be closed so its final compressed block is
            // written before we read the output buffer.
            using (var zipStream = new DeflateStream(toStream, CompressionLevel.Optimal))
            {
                fromStream.CopyTo(zipStream);
            }

            return Convert.ToBase64String(toStream.ToArray());
        }
        catch (Exception)
        {
            return null;
        }
    }

    public static T? Deserialize<T>(string dataBase64) where T : IDto
    {
        if (string.IsNullOrEmpty(dataBase64)) return default;
        try
        {
            var data = Convert.FromBase64String(dataBase64);
            using var fromStream = new MemoryStream(data);
            using var toStream = new MemoryStream();

            using (var zipStream = new DeflateStream(fromStream, CompressionMode.Decompress))
            {
                zipStream.CopyTo(toStream);
            }

            string json = Encoding.UTF8.GetString(toStream.ToArray());
            return JsonConvert.DeserializeObject<T>(json);
        }
        catch (Exception)
        {
            return default;
        }
    }

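Storing and retrieving the compressed payload then comes down to Blazored.LocalStorage's string-based calls. A rough sketch, with the caveat that the names here (localStorage, Serializer, OffsetAccount, the key) are assumptions for illustration:

    // Assumes an injected Blazored.LocalStorage ILocalStorageService named localStorage,
    // an existing OffsetAccount instance named account (the illustrative type from earlier),
    // and the Serialize/Deserialize helpers above sitting in a static class I'll call Serializer.
    string? packed = account.Serialize();
    if (packed is not null)
        await localStorage.SetItemAsStringAsync("offset-account", packed);

    // Reading back reverses the process.
    string? stored = await localStorage.GetItemAsStringAsync("offset-account");
    OffsetAccount? loaded = string.IsNullOrEmpty(stored)
        ? null
        : Serializer.Deserialize<OffsetAccount>(stored);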
With this I was able to get the data size down to 98kB using DEFLATE, including the Base64 conversion. It also means I can easily package up the local-storage data, offer to save a backup on request, and re-import it into local storage later.

The next question was the total time needed to serialize/deserialize with and without compression. Times varied between runs, but in general compression added around 20% to the read and write times. This was while running on a PC, so I'm far more cautious about the impact on a device once I have a suitable build ready for testing. However, this was reading and writing 15 years' worth of data in one hit, and even then it was adding around 100ms to a 500ms operation. The goal is to parcel data by month, or by year at worst, so the complete data set never needs to be loaded in one go.

In any case this provided a lot of food for thought about JSON-serialized data for storage, as opposed to a more contractual or otherwise visible medium such as an API payload. When we want to package data tightly for storage or transport, we can leverage options in the JSON serialization as well as compression when we have compute available on the consumer. The trade-offs to balance, besides size and performance, include the complexity of working with the data and the potential for bugs (for instance, if I move the "yyyyMM" portion of the date out of AccountDay and rely on the loaded month "packet" of data to supply it).

Friday, August 29, 2025

When will AI's "Monsanto Moment" come?

I cannot help but see parallels between the introduction of AI into software development and the rise of gene modification in crops. Crops have had their genes modified through domestication for centuries, but the rise of corporations like Monsanto popularizing lab-based manipulation led to the patent system being used to protect that investment. Back in the early 2000s there was a big push to scare consumers about the supposed risks of GMO crops, and a lot of that fear carries over today, with consumers demanding non-GMO produce on the assumption that it is healthier. The reality behind that push wasn't about health and safety; it was about money and the use of these patents to force farmers to change their practices. Prior to GMO crops, farmers would buy seed stock, and if they had a good harvest they could opt to save some of it as seed for the next year. When they bought GMO seed stock, resistant to insects or herbicides and so on, they signed a contract that barred them from keeping seed stock. Each year they would need to buy the full allotment of GMO seed. The company would even go so far as to sue adjacent farmers that might benefit from cross-pollination from GMO fields. This rather heavy-handed treatment of farmers wasn't likely to garner much sympathy from a general public who only stood to benefit from the increased yields. Still, to strike back at these corporations, the farmers' supporters attacked GMO any way they could.

Today, AI tools are becoming increasingly available, at low cost or even free. People have started asking questions around ownership, both in terms of the licensing/copyright of the code these AI tools have been trained on, and of the code they in turn generate. For now, companies like Microsoft will claim that if you use an AI tool and get challenged by a copyright holder over a violation, they've got your back, and that the code the tool generates is your IP. But for how long? Monsanto didn't start from scratch with the genetic makeup of the crops it improved. Generations of botanists' and farmers' experience and cross-breeding had supplied the existing base. Much of that was done for little more than recognition, out in the public domain, with the expectation that future improvements would remain freely available. That is, until big industry finds a way to patent its work, commercialize it, and take ownership of it. When it comes to business, it is a lot like fishing. When you first sense a fish biting at your bait, you need to resist the urge to yank the line or you'll pull the hook out of its mouth. Fish can be clever, grabbing the edge of a bait and waiting for a free meal. No, instead you wait patiently until the fish is committed, then pull and set the hook. Once they could convince the patent office and the courts, the hook was firmly set.

Today, companies like Microsoft have invested a good deal of money in developing and marketing these AI tools. They are putting their lines out in the water with tasty bait, offering to help companies and development teams produce better quality code and products faster than ever. They are being patient so as not to spook the fish. Today you own what the tool generates, but soon, I'd wager, companies like Microsoft will set that hook and, like Monsanto, demand their share of the value of the yield you produce directly or indirectly from their seed, by force. How exactly that shapes up, we'll have to see. Perhaps terms that once you start development with AI assistance you cannot "opt out"? Or will they demand part ownership on the basis that their tools generated a share of the IP in the end product?

Thursday, August 21, 2025

Why I don't Grok AI

I'm a bit of a dinosaur when it comes to software development. I've been on the rollercoaster chasing the highs of working with a new language or toolset. I've ridden through the lows when a technology I really enjoy working with ends up getting abandoned. (Silverlight? Don't get me started. For a website? Never. For intranet web apps? Chef's kiss.)

I'm honestly not that worried about AI tools in software development. As an extension to existing development tools, if it makes your life simpler, all the more power to you. Personally I don't see myself ever using it for a few reasons. One reason is the same as why I don't use tools like Resharper. You know, the 50,000 different hotkey combinations that can insert templated code, etc. etc. The reason I don't use tools like that is because for me, coding is about 75% thinking and 25% actual writing. I don't like to, nor want to write code faster because I don't need to. Often in thinking about code I realize better ways to do it, or in some cases, that I don't actually need that code at all. Having moar code fast can be overwhelming. Sure, AI tools are trained (hopefully) on best practices and should theoretically produce better code the first time around, not needing as much re-factoring, but the time to think and tweak is valuable to me. It's a bit like the tortoise and the hare. Someone with AI assistance will probably produce a solution far faster than someone without one, but at the end of the day, what good is speed if you're zipping along producing the wrong solution? Call me selfish but I also think any developer should see the writing on the wall that if a tool saves them 50% of their time, employer expectations are going to be pushing for 100% more work out of them in a day. 

The second main reason I don't see myself using AI is when it comes to stuff I don't know, or need to brush back up on, I want to be sure I fully understand the code I am responsible for, not just requesting something from an LLM. Issues like "impostor syndrome" are already a problem in many professions. I don't see the situation getting anything but worse when a growing portion of what you consider "employment" is feeding and changing the diapers on a bot. I have the experience behind me to be able to look at the code an LLM generates and determine whether it's fit for purpose, or the model's been puffing green dragon. What somewhat scares me is the idea of "vibe coding" where people that don't really understand coding use LLMs in a form of trial and error to get a solution done. Building a prototype? Great idea. Something you're going to convince people or businesses to actually use with sensitive data or decisions with consequences? Bad, bad idea.

Personally I see the value of LLM-based code generation plateauing rather quickly in terms of usefulness. It will get better, to a point, as it continues to learn from samples and corrections written and reviewed by experienced software developers. However, as GitHub starts to fill with AI-generated code, and sites like Stack Overflow die off with the new generation of developers consulting LLMs for guidance and "get this working for me" rather than "explain why this doesn't work", the overall quality of generated code will start to slip. With luck it will be noticeable before major employers dive all-in, giving up on training new developers to understand code and problem-solve, and all of us dinosaurs retire.

Until then I look forward to lucrative contracts sorting out messes that greenhorns powered by ChatGPT get themselves into. ;)

Autofac and Lazy Dependency Injection: 2025 edition

Thank you, C#! I can hardly believe it has been two years since I last posted about my lazy dependency implementation. Since then there have been a few updates to the C# language, in particular around auto-properties, that have greatly simplified lazy dependencies and the property overrides used to simplify unit testing. I also make use of primary constructors, which are ideally suited to the pattern since, unlike regular constructor injection, the assertions happen in the property accessors rather than a constructor.

The primary goal of this pattern is still to leverage lazy dependency injection while making it easier to swap in testing mocks. Classes like controllers can have a number of dependencies, but depending on the action and state passed in, many of those dependencies aren't actually used in every situation. Lazy loading sees dependencies initialized and provided only if they are needed; however, this adds a layer of abstraction to access the dependency when it's needed, and it makes mocking the dependency out for unit tests a tad more awkward.

The solution, which I call lazy dependencies + properties, mitigates these two issues. The property accessor handles unwrapping the lazy proxy to expose the dependency for the class to consume, while also allowing a mock to be injected. Each lazy dependency in the constructor is optional. If the IoC container doesn't provide a dependency, or a test does not mock a referenced dependency, the property accessor throws a for-purpose DependencyMissingException to note which dependency was not provided.
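That exception is nothing special; a minimal sketch (using C# 12 primary constructors, and not necessarily the project's exact implementation) might be:

    using System;

    public class DependencyMissingException(string dependencyName)
        : Exception($"The dependency '{dependencyName}' was not provided.")
    { }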

Updated pattern:

 public class SomeClass(Lazy<ISomeDependency>? _lazySomeDependency = null)
 {
     [field: MaybeNull]
     public ISomeDependency SomeDependency
     {
         protected get => field ??= _lazySomeDependency?.Value ?? throw new DependencyMissingException(nameof(SomeDependency));
         init;
     }
 }

This is considerably simpler than the original implementation. We can use the primary constructor syntax since we no longer need to assert in a constructor whether a dependency was injected; under normal circumstances all lazy dependencies will be injected by the container, and asserting them falls to the property accessor. No code, save the accessor property, should access dependencies through the lazy reference. The auto-property syntax recently added to C# gives us access to the "field" keyword. We also leverage a public init setter so that tests can inject mocks for any dependencies they will use, while the getter remains protected (or private) for accessing the dependency within the class. The property looks for an initialized instance, then checks the lazy injected source, before raising an exception if the dependency has not been provided.
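On the container side nothing exotic is required: Autofac supplies Lazy<T> for any registered service through its implicit relationship types. A registration sketch (the concrete SomeDependency class here is assumed for illustration) can be as plain as:

    using Autofac;

    var builder = new ContainerBuilder();
    builder.RegisterType<SomeDependency>().As<ISomeDependency>().InstancePerLifetimeScope();
    builder.RegisterType<SomeClass>();

    // Autofac satisfies the Lazy<ISomeDependency> constructor parameter automatically;
    // the dependency itself is only resolved the first time SomeDependency is accessed.
    var container = builder.Build();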

Unit tests provide a mock through the init setter rather than trying to mock a lazy dependency:

Mock<ISomeDependency> mockDependency = new();
mockDependency.Setup(x => /* set up mocked scenario */);

var classUnderTest = new SomeClass
{
    SomeDependency = mockDependency.Object
};

// ... Test behaviour, assert mocks.

In this simple example it may not look particularly compelling, but in controllers that have several, maybe a dozen, dependencies, this can significantly simplify test initialization. If a test scenario is expected to touch 3 out of 10 dependencies, then you only need to provide mocks for those 3 rather than always mocking all 10 for every test. If the internal code is updated to touch a 4th dependency, the tests will break until they are updated with a suitable mock for the extra dependency. This lets you mock only what you need to mock, and avoids silent or confusing failures when catch-all default mocks respond to scenarios they were never intended to handle.

Monday, January 23, 2023

Autofac and Lazy Dependency Injection

Autofac's support for lazy dependencies is, in a word, awesome. One issue I've always had with constructor injection has been writing unit tests. If a class has more than a couple of dependencies to inject, you quickly run into situations where every single unit test needs to provide mocks for every dependency, even though a given test scenario only needs to configure one or two of them.

A pattern I had been using in the past, when introducing Autofac into legacy applications that weren't using dependency injection, was to use Autofac's lifetime scope as a Service Locator, so that the only breaking change was injecting the Autofac container into the constructors, then using properties that resolve themselves on first access. The Service Locator could even be served by a static class if injecting it was problematic for teams, but I generally advise against that to avoid the Service Locator being used willy-nilly everywhere, which becomes largely untestable. What I did like about this pattern is that unit tests would always pass a mocked Service Locator that would throw on any request for a dependency, and would set the dependencies they needed via the internally visible property-based dependencies. In this way a class might have six dependencies with a constructor that just takes the Service Locator. A test that is expected to need two of those dependencies simply mocks those two and sets the properties, rather than injecting those two mocks plus four extra unused mocks.

An example of the Service Locator design:

private readonly ILifetimeScope _scope;

private ISomeDependency? _someDependency = null;
public ISomeDependency SomeDependency
{
    get => _someDependency ??= _scope.Resolve<ISomeDependency>()
        ?? throw new ArgumentException("The SomeDependency dependency could not be resolved.");
    set => _someDependency = value;
}

public SomeClass(IContainer container)
{
    if (container == null) throw new ArgumentNullException(nameof(container));
    _scope = container.BeginLifetimeScope();
}

public void Dispose()
{
    _scope.Dispose();
}


This pattern is a trade-off that reduces the impact of refactoring code that wasn't built around dependency injection while still introducing an IoC design that makes the code testable. The thing to watch out for is the use of the _scope lifetime scope: the only place the _scope reference should ever be used is in the dependency properties, never in methods and the like. If I have several dependencies and write a test that will use SomeDependency, all tests construct the class under test with a common mock container that provides a mocked lifetime scope that throws on Resolve, then use the property setters to initialize the dependencies used with mocks. If the code under test evolves to use an extra dependency, the associated tests will fail as the Service Locator highlights requests for an unexpected dependency.
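As a rough sketch of that test setup, assuming Moq and the SomeClass/ISomeDependency example above, with a strict mock scope standing in for "throws on Resolve":

    using Autofac;
    using Moq;

    // The mocked container hands out a strict lifetime scope: any attempt to
    // resolve a dependency the test did not provide fails immediately.
    var mockScope = new Mock<ILifetimeScope>(MockBehavior.Strict);
    var mockContainer = new Mock<IContainer>();
    mockContainer.Setup(c => c.BeginLifetimeScope()).Returns(mockScope.Object);

    var mockDependency = new Mock<ISomeDependency>();

    var classUnderTest = new SomeClass(mockContainer.Object)
    {
        // Provided directly via the property setter; never resolved from the scope.
        SomeDependency = mockDependency.Object
    };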

The other advantage this gives you is performance, for instance with MVC controllers handling requests. A controller might have a number of actions with touch-points on several dependencies over the course of user requests, but an individual request might only "touch" one or a few of them. Since many dependencies are scoped to a request, you can reduce server load by having the container resolve dependencies only when and if they are needed: lazy dependencies.

Now, I have always been interested in reproducing that ease and flexibility of testing, while ensuring that dependencies are only resolved if and when they are needed, in a proper DI implementation. Enter lazy dependencies, and a new example:

private readonly Lazy<ISomeDependency>? _lazySomeDependency = null;
private ISomeDependency? _someDependency = null;
public ISomeDependency SomeDependency
{
    get => _someDependency ??= _lazySomeDependency?.Value
        ?? throw new ArgumentException("SomeDependency dependency was not provided.");
    set => _someDependency = value;
}

public SomeClass(Lazy<ISomeDependency>? someDependency = null)
{
     _lazySomeDependency = someDependency;
}

This will look similar to the Service Locator example above, except this operates with a full Autofac DI integration. Autofac provides dependencies as Lazy references, which we default to null, allowing tests to construct our classes under test without dependencies. Test suites don't have a DI provider running, so they rely on the setters to initialize mocks for the dependencies that suit the provided test scenario. There is no need to set up Autofac to provide mocks or be configured at all, as there is with the Service Locator pattern.

In any case, this is something I've found to be extremely useful when setting up projects so I thought I'd throw up an example if anyone is searching for options with Autofac and lazy initialization.

Thursday, May 30, 2019

JavaScript: Crack for coders.

I've been cutting code for a long time. I was educated in Assembler, Pascal, and C/C++. I couldn't find local work doing that so I taught myself Access, Visual Basic, T-SQL, PL-SQL, Delphi, Java, Visual C++, and C# (WinForms, WebForms, MVC, Web API, WPF, and Silverlight). It wasn't until I really started to dive into JavaScript that programming started to feel "different".

With JavaScript you get this giggly-type high when you're sitting there in your editor, a dev server running on the other screen, tinkering away while the code "just works". You can incrementally build up your application with "happy little accidents", just like a Bob Ross painting. Webpack and Babel automatically wrangle the vast, conflicting standards for modules, ES versions, and the like into something that actually runs on most browsers. React's virtual DOM makes screen updates snappy, and MobX manages your shared state without stringing together a chain of callbacks and Promises. And there is so much out there to play with and experiment with: tens of thousands of popular packages on npm waiting to be discovered.

But those highs are separated by often lingering sessions testing your Google-fu trying to find current, relevant clues on just why the hell your particular code isn't working, or how to get that one library you'd really like to use to import properly into your ES6 code without requiring you to eject your Create-React-App application. You get to grit your teeth with irritations like when someone managing moment.js decided that add(number, period) made more sense than add(period, number) while all of the examples you'd been using had it still the other way around. Individually, these problems seem trivial, but they really add up quick when you're just trying to get over that last little hurdle between here and your next high.


JavaScript development is quite literally Crack for coders.

Monday, May 20, 2019

YAGNI, KISS, and DRY. An uneasy relationship.

As a software developer, I am totally sold on S.O.L.I.D. principles when it comes to writing the best software. Developers can get pedantic about one or more of these principles, but in general following them is simple and it's pretty easy to justify their value in a code base. Where things get a bit more awkward is when discussing three other principles that like to hang out in coding circles:

YAGNI - You Ain't Gonna Need It
KISS - Keep It Stupidly Simple
DRY - Don't Repeat Yourself

YAGNI, he's the Inquisitor. In every project there are the requirements the stakeholders want, and then there are the assumptions that developers and non-stakeholders come up with in anticipation of what they think stakeholders will want. YAGNI sees through these assumptions and expects every feature to justify its existence.

KISS is the lazy one in the relationship. KISS doesn't like things complicated, and prefers to do the simplest thing because it's less work, and when it has to come back to an area of a project later on, it wants to be sure that it's able to easily understand and pick the code back up.  KISS gets along smashingly with YAGNI because the simplest, easiest thing is not having to write anything at all.

DRY on the other hand is like the smart-assed, opinionated one in a relationship. DRY wants to make sure everything is as optimal as possible. DRY likes YAGNI, and doesn't get along with KISS and often creates friction in the relationship to push KISS away. The friction comes when KISS advocates doing the "simplest" thing and that results in duplication. DRY sees duplication and starts screaming bloody murder.

Now how do you deal with these three in a project? People have tried taking sides, asking which one trumps the others. The truth is that all three are equally important, but I feel timing is the real problem when it comes to dealing with KISS and DRY. When teams introduce DRY too early in a project you get into fights with KISS, and you can end up repeating your effort even if your intention wasn't to repeat code.

YAGNI needs to be involved in every stage of a project. Simply do not ever commit to building anything that doesn't need to be built. That's an easy one.

KISS needs to be involved in the early stages of a project or set of features. Let KISS work freely with YAGNI and aim to make code "work".  KISS helps you vet out ideas in the simplest way and ensure that the logic is easy to understand and easy to later optimize.

DRY should be introduced into the project no sooner than when features are proven and you're ready to optimize the code. DRY is an excellent voice for optimization, but I don't believe it is a good voice to listen to early in a project because it leads to premature optimization.

Within a code base, the more code you have, the more work there is to maintain the code and the more places there are for bugs to hide. With this in mind, DRY would seem to be a pretty important element to listen to when writing code. While this is true there is one important distinction when it comes to code duplication. Code should be consolidated only where the behavior of that code is identical. Not similar, but identical.  This is where DRY trips things up too early in a project. In the early stages of a project, code is highly fluid as developers work out how best to meet requirements that are often still being fleshed out. There is often a lot of similar code that can be identified, or code that you "expect" to be commonly used so DRY is whispering to centralize it, don't repeat yourself!

The problem is that when you listen to DRY too early you can violate YAGNI unintentionally by composing structure you don't need yet, and you make the code more complex than it needs to be. If you optimize "similar" code too early you either end up with conditional code to handle the cases where the behaviour is similar but not identical, or you work to break the functionality down so atomically that you can separate out the identical parts. (Think functional programming.) This can make code a lot more complex than it needs to be too early in development, leading to a lot more effort as new functionality is introduced. You either struggle to fit within the pre-defined pattern and its restrictions, or end up working around it entirely, violating DRY.

Another risk of listening to DRY too early is when it steers development heavily towards inheritance rather than composition. I've often seen projects that focus too heavily and too early on DRY start to architect systems around multiple levels of inheritance and generics to try and produce minimalist code. The code invariably proves complex and difficult to work with. Features become expensive to add because new use cases don't "fit" the preconceptions around how the related code was meant to be reused, leading to complex rewrite efforts or work-arounds.

Some people argue that we should listen to DRY over KISS because DRY enforces the Single Responsibility Principle from S.O.L.I.D. I disagree in many situations, because DRY can violate SRP when consolidated code ends up with two or more reasons to change. Take generics, for example. A generic class should represent identical functionality across multiple types. That is fine; however, you commonly find a situation where a new type could benefit from the generic except... And that "except" now poses a big problem. A generic class or method violates SRP because its reason for change is governed by every class that leverages it. So long as the needs of those classes are identical we can ignore the potential implications for SRP, but as soon as those needs diverge we either violate SRP, violate DRY with nearly duplicate code, or further complicate the code to try and keep every principle happy. Functional code conforms to SRP because it does one thing and does it well. However, when you aren't working in a purely functional domain, consolidated code serving parts of multiple business cases now potentially has more than one reason to change. Clinging to DRY can, and will, make your development more costly as you struggle to make new requirements conform.
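To make the generics point concrete, here is a tiny, contrived C# illustration; the types are made up purely for the example:

    using System.Collections.Generic;
    using System.Linq;

    public interface IEntity { int Id { get; } }

    // Identical behaviour for every T: one generic class keeps both DRY and SRP happy.
    public class InMemoryRepository<T> where T : class, IEntity
    {
        private readonly List<T> _items = new();
        public void Add(T item) => _items.Add(item);
        public T? GetById(int id) => _items.FirstOrDefault(e => e.Id == id);
    }

    // The "except": suppose one entity type now needs soft-deleted rows hidden from
    // GetById. The generic either grows type-specific conditionals (more than one
    // reason to change), gets a near-duplicate sibling (violating DRY), or gains
    // extra abstraction (fighting KISS) - the exact tension described above.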

Personally, I listen to YAGNI and KISS in the early stages of developing features within a project, then start listening to DRY once the features are reasonably established. DRY and KISS are always going to butt heads, and I tend to take KISS's side in any argument if I feel like that area of the code may continue to get attention or that DRY will risk making the code too complex to conveniently improve or change down the track.  I'll choose simple code that serves one purpose and may be similar to other code serving another purpose over complex code that satisfies an urge not to see duplication, but serves multiple interests that can often diverge.