Forget SOLID. Say Hello To SHOC Principles for Modular Design.

Yesterday, I ran a little experiment on social media. Like TDD, SOLID “class design principles” have become de rigueur on software developers’ CVs. And, like TDD, most of the CVs SOLID is featured on belong to people who evidently haven’t understood – let alone applied – the principles.

There’s also been much talk in recent years about whether SOLID principles need updating. Certainly, as a trainer and coach in software design practices, I’ve found SOLID to be insufficient and often confusing.

So, with both these things in mind, I threw out my own set of principles – SHOC – to explain how Codemanship teaches principles of modular software design. Note that I’m very deliberately not saying “object-oriented” or “class” design.

Here are the four principles taught on Codemanship Code Craft courses. Good modules should:

  • Be Swappable
  • Hide their internal workings
  • Do One job
  • Have Client-driven interfaces

Now, those of you who do understand SOLID may notice that SHOC covers 4 of the 5 letters.

The Liskov Substitution and Dependency Inversion principles in SOLID are about making modules interchangeable (basically, about polymorphism).

The Single Responsibility Principle is essentially “do one job”, though its rationale I’ve long found to be a red herring. The real design benefit of SRP is greater composability (think of UNIX pipes), so I focus on that when I explain it.

The Interface Segregation Principle is a backwards way of saying that interfaces should be designed from the client’s point of view.

So that’s the S, the L, the I and the D of SOLID. SLID.

But SOLID is missing something really rather crucial. It doesn’t explicitly mandate encapsulation. Taken together, the principles may imply it, depending on how you interpret and apply them.

But you can easily satisfy every SOLID – or SLID – principle and still have every module expose its internals in a very unseemly manner, with getters and setters galore. Interface Segregation, for example, says that’s fine just as long as only the getters and setters the client’s using are exposed.

So I find the need to add Tell, Don’t Ask – that objects shouldn’t ask for data to do work, they should tell the objects that contain the data to do the work themselves, enabling us to hide that data – to the list of modular design principles developers need to learn. SLIDT.
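The difference between asking and telling can be sketched in a few lines of Java. This is only an illustration: the account classes and their methods are hypothetical names, not taken from any Codemanship material.

```java
// Hypothetical example: these classes are illustrative only.

// "Ask" style: the client pulls the data out and does the work itself,
// so the balance has to be exposed through a getter and a setter.
class AskAccount {
    private long balance;

    AskAccount(long balance) { this.balance = balance; }

    long getBalance() { return balance; }
    void setBalance(long balance) { this.balance = balance; }
}

// "Tell" style: the client tells the account what to do,
// and the balance never leaves the object.
class TellAccount {
    private long balance;

    TellAccount(long balance) { this.balance = balance; }

    void withdraw(long amount) {
        if (amount > balance) {
            throw new IllegalArgumentException("Insufficient funds");
        }
        balance -= amount;
    }

    boolean canCover(long amount) { return amount <= balance; }
}

class TellDontAskDemo {
    public static void main(String[] args) {
        // Ask: the "no overdrafts" rule ends up living in the client.
        AskAccount asked = new AskAccount(100);
        if (asked.getBalance() >= 30) {
            asked.setBalance(asked.getBalance() - 30);
        }

        // Tell: the rule lives with the data it protects.
        TellAccount told = new TellAccount(100);
        told.withdraw(30);
        System.out.println(told.canCover(70));
    }
}
```

Both versions would pass an Interface Segregation check, but only the second hides its data.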

And what happened to the Open-Closed Principle of SOLID – that classes should be open to extension and closed to modification? The original rationale for the OCP was the time it took to build and test code. This is a callback to a time when our computers were about 1000x slower than today. I used to take a long coffee break when my code was compiling, and our tests ran overnight. And that was advanced at the time.

Now we can build and test our code in minutes or even seconds – well, if our testing pyramid is the right way up – and modifying existing modules really is no big deal. The refactoring discipline kind of relies on modules being open to modification, for example.

And, as Kevlin Henney has rightly pointed out, we could think of OCP as being more of a language design principle than a software design principle. “Open to extension” in C++ means something quite different in JavaScript.

So I dropped the O in SOLID. It’s a 90’s thing. Technology moved on.

“SLIDT”**, of course, doesn’t exactly trip off the tongue, and I doubt many would use it as their “memorable information” for their online banking log-in.

So I came up with SHOC. Without a K*. Modules should be swappable, hide their internal workings, do one job and have interfaces designed from the client’s point of view.
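As a rough illustration of the four principles working together, here is a minimal Java sketch. Every name in it (SalesRate, InMemorySalesRate, StockCheck, the product id 811) is hypothetical and chosen only for the example; the same shape could be expressed with functions or modules in other paradigms.

```java
import java.util.Map;

// Client-driven interface: shaped by the one question the client (StockCheck)
// needs answered, not by what any particular data source happens to offer.
interface SalesRate {
    double dailyRate(int productId);
}

// Swappable: any SalesRate implementation can be plugged into StockCheck.
// Hides its workings: the map is private and never handed out.
class InMemorySalesRate implements SalesRate {
    private final Map<Integer, Double> rates;

    InMemorySalesRate(Map<Integer, Double> rates) {
        this.rates = Map.copyOf(rates);
    }

    @Override
    public double dailyRate(int productId) {
        return rates.getOrDefault(productId, 0.0);
    }
}

// Does one job: decides whether stock will run out within the lead time.
class StockCheck {
    private final SalesRate salesRate;

    StockCheck(SalesRate salesRate) { this.salesRate = salesRate; }

    boolean needsReorder(int productId, int stock, int leadTimeDays) {
        return stock <= salesRate.dailyRate(productId) * leadTimeDays;
    }
}

class ShocDemo {
    public static void main(String[] args) {
        StockCheck check = new StockCheck(new InMemorySalesRate(Map.of(811, 2.0)));
        System.out.println(check.needsReorder(811, 10, 7));

        // Swap in a different implementation (here a lambda) without touching StockCheck.
        StockCheck quiet = new StockCheck(productId -> 0.5);
        System.out.println(quiet.needsReorder(811, 10, 7));
    }
}
```

Because StockCheck only knows about the one-method interface, a test double, a database-backed version or a web service client could each stand in for the in-memory version.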

This is how I teach modular design now – in multiple programming paradigms and at multiple levels of code organisation – and I can report that it’s been far more successful when you measure it in terms of the impact on code quality it’s had.

It’s much easier to understand, and much easier to apply, be it in C++, or Java, or C#, or JavaScript, or Clojure, or COBOL. Yes, you heard me. COBOL.

SHOC is built on the original principles of modular design dating back to the late 1960s and early 70s – namely that modules should be interchangeable, have good separation of concerns, present “well-defined” interfaces (okay, so I interpreted that) and hide their information. Like Dr. Martens boots, I’m not expecting these to go out of fashion anytime soon.

When I teach software design principles, I teach them as two sets: Simple Design and SHOC. Occasionally, students ask “But what about SOLID?” and we have that conversation – but increasingly less often of late.

So, will you be adding “SHOC” to your CV? Probably not. That was never the point. SOLID as a “must-have” skill will soldier on, despite being out-of-date, rarely correctly applied, and widely misunderstood.

But that’s never stopped us before 😉

*It’s been especially fun watching people try to add an arbitrary K to SHOC. Nature abhors an incomplete mnemonic.

**It just occurred to me that if I’d used “encapsulates its internal workings” instead of “Tell, Don’t Ask”, I could have had SLIDE, but folk would still get confused by the SLID part.

A Vision for Software Development: It’s All About Teams

I’ve thought a lot in recent years about how our profession is kind of fundamentally broken, and how we might be able to fix it.

The more I consider it, the more I think the underlying dysfunction revolves around software development teams, and the way they’re perceived as having only transient value.

Typically, when a business wants some new software, it builds a team specifically to deliver it. This can take many months and cost a lot of money. First, you have to find the people with the skills and experience you need. That in itself usually works out expensive – to the tune of tens of thousands of pounds per developer – before you’ve paid them a penny.

But the work doesn’t end there. Once you’ve formed your team, you then need to go through the “storming and norming” phases of team-building, during which they figure out how to work together. This, too, can work out very expensive.

So a formed, stormed and normed software team represents a very significant investment before you get a line of working code.

And, as we know, some teams never reach the performing stage, being stuck permanently in storming and norming and never really finding a satisfactory way to move forward together as they all pull in different directions.

The high-performing teams – the ones who work well together and can deliver good, valuable working software – are relative rarities, then: the truffles of the software industry.

Indeed, I’ve seen on many occasions how the most valuable end product from a software development effort turned out to be the team itself. They work well together, they enjoy working together, and they’re capable of doing great work. It’s just a pity the software itself was such a bad idea in the first place.

It seems somewhat odd then that businesses are usually so eager to break up these teams as soon as they see the work is “done”. It’s a sad fact of tech that the businesses who rely on the people who make it prefer to suffer us for as short a time as possible.

And this is where I think we got it wrong: should it be up to the customer to decide when to break up a high-performing dev team?

I can think of examples where such teams seized the day and, upon receiving their marching orders, set up their own company and bid for projects as a team, and it’s worked well.

This is very different to the standard model of development outsourcing, where a consultancy is effectively just a list of names of developers who might be thrown together for a specific piece of work, and then disbanded just as quickly at the end. Vanishingly few consultancies are selling teams. Most have to go through the hiring and team-building process themselves to fulfil their bids, acting as little more than recruitment agencies – albeit more expensive ones.

But I can’t help thinking that it’s teams that we should be focusing on, and teams our profession should be organising around:

  • Teams as the primary unit of software delivery
  • Teams as the primary commercial unit, self-organising and self-managing – possibly with expert help for accounts and HR etc. Maybe it’s dev teams who should be outsourcing?
  • Teams as the primary route for training and development in our profession – i.e., through structured long-term apprenticeships

I have a vision of a software development profession restructured around teams. We don’t work for big companies who know nothing about software development. We work in partnerships that are made up of one or more teams, each team member specialised enough for certain kinds of work but also generalised enough to handle a wide range of work.

Each team would take on a small number of apprentices, and guide and mentor them – investing in training and development over a 3-5 year programme of learning and on-the-job experience – to grow the 10% of new developers our industry needs each year.

Each team would manage itself and work directly with customers. This should be part of the skillset of any professional developer.

Each team would make its own hiring decisions when it feels it needs specialised expertise from outside, or needs to grow (although my feelings on team size are well known), or wants to take on apprentices. So much that’s wrong with our industry stems from hiring decisions being taken by unqualified managers – our original sin, if you like.

And, for sure, these teams wouldn’t be immutable forever and all time. There would be an organic process of growth and change, perhaps of splitting into new teams as demand grows, and bringing in new blood to stop the pond from stagnating. But, just as pretty much every cell in my body’s been replaced many times and I’m somehow still recognisably me, it is possible with ongoing change to maintain a pattern of team identity and cohesion. There will always be a background level of forming, storming and norming. The trick is to keep that at a manageable level so we can keep delivering in the foreground.

There’s No Such Thing As “Agile”. There’s Just Software Development.

Okay, so this Sunday morning rant’s been a long time coming. And, for sure, I’ve expressed similar sentiments before. But I don’t think I’ve ever dedicated a whole blog post to this, so here goes. You may want to strap in.

20 years ago, a group of prominent software folk gathered at a ski resort in Utah to fix software development. Undoubtedly, it had become broken.

Broken by heavyweight, command-and-control processes. Broken by unrealistic and oftentimes downright dishonest plan-driven management that tried to impose the illusion of predictability on something that’s inherently unpredictable. Broken by huge outsourced teams and $multi-million – sometimes even $multi-billion – contracts that, statistically, were guaranteed to fail, crushed by their own weight. Broken by the loss of basic technical practices and the influx of low-skilled programmers to fuel the first dotcom boom, all in the name of ballooning share prices of start-ups – many of which never made a red cent of profit.

All of this needed fixing. The resulting Manifesto for Agile Software Development attempted to reset the balance towards lightweight, feedback-driven ways of working, towards smaller, self-organising teams, towards continuous and rich face-to-face communication, and towards working software as the primary measure of progress.

Would that someone in that room had been from a marketing and communications background. A fundamental mistake was made at that meeting: they gave it a name.

And so, Agile Software Development became known as “another way of doing software development”. We could choose. We could be more Agile (with a capital “A”). Or, we could stick with our heavyweight, command-and-control, plan-driven, document-driven approach. Like Coke Zero and Original Coke.

The problem is that heavyweight, command-and-control, plan-driven, document-driven approaches tend to fail. Of course, for the outsourcing companies and the managers, they succeed in their underlying intention, which is to burn through a lot of money before the people signing the cheques realise. Which is why that approach still dominates today. I call it Mortgage-Driven Development. You may know it as “Waterfall”.

But if we measure it by tangible results achieved, Mortgage-Driven Development is a bust. We’ve known throughout my entire lifetime that it’s a bust. Winston Royce warned us it was a bust in 1970. No credible, informed commentator on software development has recommended we work that way for more than 50 years.

And yet, still many do. (The main difference in 2021 being that a lot of them call it “Agile”. Oh, the irony.)

How does Mortgage-Driven Development work, then? Well – to cut a long story short – badly, if you measure it by tangible customer outcomes like useful working software and end user problems being solved. If you measure it by the size of developers’ houses, though, it works really, really well.

MDD works from a very simple principle – namely that our customer shouldn’t find out that we’ve failed until a substantial part of our mortgage has been paid off. The longer we can delay the expectation of seeing working software in production, the more of our mortgage we can pay off before they realise there is no working software that can be released into production.

Progress in MDD is evidenced by documentation. The more of it we generate, the more progress is deemed to have been achieved. I’ve had to drag customers kicking and screaming to look at actual working software. But they’re more than happy to look at a 200-page architecture document purporting to describe the software, or a wall-sized Gantt chart with a comforting “You are here” to make the customer think progress has actually been made.

Of course, when I say “more than happy to look at”, they don’t actually read the architecture document – nobody does, and that includes the architects who write them – or give the plan anything more than a cursory glance. These documents are like a spare tyre in the boot of your car, or a detailed pandemic response plan sitting on a government server. There’s comfort in merely knowing they exist, even if – when the time comes – they are of no actual use.

Why customers and managers don’t find comfort in visible, tangible software is anybody’s guess. It could come down to personality types, maybe.

Teams who deliver early and often present the risk of failing fast. I took over a team for a small development shop who had spent a year going around in circles with their large public sector client. No software had been delivered. With me in the chair, and a mostly new team of “Agile” software developers, we delivered working software within three weeks from a standing start (we even had to build our own network, connected to the Internet by my 3G dongle). At which point, the end client decided this wasn’t working out, and canned the contract.

That particular project lives in infamy – recruiters would look at my CV and say “Oh, you worked on that?” It was viewed as a failure. I view it as a major success. The end client paid for a year’s worth of nothing, and because nothing had been delivered, they didn’t realise it had already failed. They’d been barking up entirely the wrong tree. It took us just three weeks to make that obvious.

Saving clients millions of pounds by disproving their ideas quickly might seem like a good thing, but it runs counter to the philosophy of Mortgage-Driven Development.

I’ve been taken aside and admonished for “actually trying to succeed” with a software project. Some people view that as risky, because – in their belief system – we’re almost certainly going to fail, and therefore all efforts should be targeted at billing as much as possible and at escaping ultimate blame.

And, to me, this thing called Agile Software Development has always essentially just been “trying to succeed at delivering software”. We’re deliberately setting out to give end users what they need, and to do it in a way that gives them frequent opportunities to change their minds – including about whether they see any value in continuing.

The notion that we can do that without frequent feedback from end users trying working software is palpable nonsense – betting the farm on a proverbial “hole in one”. Nature solved the problem of complex system design, and here’s a heads-up: it isn’t a design committee, or a Gantt chart, or a 200-page architecture document.

Waterfall doesn’t work and never did. Big teams typically achieve less than small teams. Command-and-control is merely the illusion of control. Documents are not progress. And your project plan is a fiction.

When we choose to go down that road, we’re choosing to live in a lie.

Fast-Running Tests Are Key To Agility. But How Fast Is ‘Fast’?

I’ve written before about how vital it is to be able to re-test our software quickly, so we can ensure it’s always shippable after every change we make to the code.

Achieving a fast-running test suite requires us to engineer our tests so that the vast majority run as quickly as possible, and that means most of our tests don’t involve any external dependencies like databases or web services.

If we visualise our test suites as a pyramid, the base of the pyramid – the bulk of the tests – should be these in-memory tests (let’s call them ‘unit tests’ for the sake of argument). The tip of the pyramid – the slowest running tests – would typically be end-to-end or system tests.

But one person’s “fast” is often another person’s “slow”. It’s kind of ambiguous as to what I and others mean when we say “your tests should run fast”. For a team relying on end-to-end tests that can take many seconds to run, 100ms sounds really fast. For a team relying on unit tests that take 1 or 2 milliseconds, 100ms sounds really slow.

A Twitter follower asked me how long a suite of 5,000 tests should take to run. If the test suite’s organised into an ideal pyramid, then – and, of course, these are very rough numbers based on my own experience – it might look something like this:

  • The top of the pyramid would be end-to-end tests. Let’s say each of those takes 1 second. You should aim to have about 1% of your tests be end-to-end tests. So, 50 tests = 50s.
  • The middle of the pyramid would be integration and contract tests that check interactions with external dependencies. Maybe they each take about 100ms to run. You should aim to have less than 10% of those kinds of tests, so about 500 tests = 50s.
  • The base of the pyramid should be the remaining 4450 unit tests, each running in roughly 1-10ms. Let’s take an average of 5ms. 4450 unit tests = 22s.

You’d be in a good place if the entire suite could run in about 2 minutes.
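The arithmetic behind that figure can be checked with a small Java sketch. The per-test timings are the ballpark numbers from the list above; the class and method names are made up for the example.

```java
// A rough model of the ideal pyramid; names here are illustrative only.
class TestPyramid {
    static final long END_TO_END_MS = 1_000; // about 1 second each
    static final long INTEGRATION_MS = 100;  // about 100ms each
    static final long UNIT_MS = 5;           // average of the 1-10ms range

    // Total run time in milliseconds for a suite split across the three layers.
    static long runTimeMillis(int endToEnd, int integration, int unit) {
        return endToEnd * END_TO_END_MS
             + integration * INTEGRATION_MS
             + unit * UNIT_MS;
    }

    public static void main(String[] args) {
        // 5,000 tests: 50 end-to-end, 500 integration/contract, 4,450 unit.
        long total = runTimeMillis(50, 500, 4_450);
        System.out.println("Ideal pyramid: " + total + " ms");
    }
}
```

That comes to 122,250 ms (50s + 50s + roughly 22s), or just over two minutes.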

Of course, these are ideal numbers. But it’s the ballpark we’re interested in. System tests run in seconds. Integration tests run in 100s of milliseconds. Unit tests run in milliseconds.

It’s also worth bearing in mind you wouldn’t need to run all of the tests all of the time. Integration and contract tests, for example, only need to be run when you’ve changed integration code. If that’s 10% of the code, then we might need to run them 10% of the time. End-to-end tests might be run even less frequently (e.g., in CI).

Now, what if your pyramid was actually a diamond shape, with the bulk of your tests hitting external processes? Then your test suite would take about 8 minutes to run, and you’d have to run those integration tests 90% of the time. Most teams would find that a serious bottleneck.

And if your pyramid was upside-down, with most tests being end-to-end, then you’re looking at 75 minutes for each test run, 90% of the time. I’ve seen those kinds of test execution times literally kill businesses with their inability to evolve their products and systems.
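To see where figures like 8 minutes and 75 minutes come from, here is a minimal Java sketch using the same per-test timings (1 second, 100ms and 5ms). The exact splits are assumptions made for the example: a diamond with 1% end-to-end, 90% integration and 9% unit tests, and an upside-down pyramid with 90% end-to-end, 9% integration and 1% unit tests.

```java
// Illustrative only: the suite splits below are assumed, not measured.
class SuiteShapes {
    // Run time in minutes for a suite with the given number of tests per layer.
    static double minutes(int endToEnd, int integration, int unit) {
        long millis = endToEnd * 1_000L + integration * 100L + unit * 5L;
        return millis / 60_000.0;
    }

    public static void main(String[] args) {
        // 5,000 tests in each case.
        double diamond = minutes(50, 4_500, 450);
        double upsideDown = minutes(4_500, 450, 50);
        System.out.println("Diamond: " + diamond + " minutes");
        System.out.println("Upside-down: " + upsideDown + " minutes");
    }
}
```

Under those assumptions the diamond comes out at a little over 8 minutes and the upside-down pyramid at just under 76 minutes, in line with the rough numbers above.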

Training Is Expensive. But Not As Expensive As Not Training.

However they choose to learn – from books, videos, blogs, online courses, instructor-led training, etc – by far the biggest cost in building a developer’s skills is the time that it takes.

I’ve worked with several thousand developers in more than 100 organisations over the last decade, so I have a perspective on how much time is really required. If you’re a manager, you may want to sit down for this.

Let’s start from scratch – a newly-minted programmer, just starting out in their dev career. They may have been programming for 6-12 months, perhaps at school or in a code club, or at home with a couple of books.

At this point, they’re a long way from being competent enough to be left alone to get on with writing real software for real end users in a real business. Typically, we find that it takes another 2-3 years. Before then, they’re going to need a lot of supervision – almost continuous – from a more experienced developer.

Of course, you could just leave them to their own devices, freeing up that more productive mentor to write their own code. But we know that the cost of maintaining code over its lifetime is an order of magnitude higher than the cost of writing it in the first place. An inexperienced developer’s code is likely to be far less maintainable, and therefore cost far more to live with.

This is the first hidden cost of learning, and it’s a big one.

But it’s not just about maintainability of code, of course. Inexperienced developers are less likely to know how to pin down requirements, and therefore more likely to build the wrong things, requiring larger amounts of rework. And this is rework of code that’s harder to change, so it’s expensive rework.

More mature organisations recognise this, and invest more to get their developers up to speed sooner. (Many developers, sadly, never learn to write maintainable code at any point in their career – it’s pot luck if you happen to end up being exposed to good practices).

Or you could exclusively hire more experienced developers, of course. But that plan has two fatal flaws. Firstly, hiring developers is very expensive and takes months. Secondly, if nobody hires inexperienced developers, where will these experienced developers come from?

So, you end up paying the piper one way or another. You can pay him for training. Or you can pay him for constant supervision. Or you can pay him for bug fixes and rework. Or you can pay him to try and recruit senior developers.

It turns out that training – I mean, really training – your developers is the cheapest option. It’s also the option least chosen.

On paper, it sounds like a huge investment. Some development organisations spend as much as 25% of their entire budget on learning and improving. Most organisations balk at this. It’s too much!

The lion’s share of this manifests in the developers’ time. They might, for example, give developers one day a week dedicated to learning and improving (and, as they become more senior, researching and experimenting). For a team of 6 developers, that adds up to £140,000 a year of developer time.

They might send teams on training courses. A group of 12 – the average Codemanship class size – on a 3-day course represents approximately £16,000 of dev time here in London.
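As a sanity check, the two figures above imply broadly similar day rates. The Java sketch below works them out; the 46 working weeks per year is an assumption made for the example, and the names are hypothetical.

```java
// Back-of-envelope check; WORKING_WEEKS is an assumed figure, not from the article.
class TrainingCost {
    static final int WORKING_WEEKS = 46;

    // Cost per developer-day implied by a total cost spread over a group.
    static double impliedDayRate(double totalCost, int developers, int daysEach) {
        return totalCost / (developers * daysEach);
    }

    public static void main(String[] args) {
        // One learning day a week for a team of 6 over a year.
        double weekly = impliedDayRate(140_000, 6, WORKING_WEEKS);
        // A 3-day course for a group of 12.
        double course = impliedDayRate(16_000, 12, 3);
        System.out.println("Learning day: about " + Math.round(weekly) + " per developer-day");
        System.out.println("Course day: about " + Math.round(course) + " per developer-day");
    }
}
```

Both work out at somewhere between £400 and £550 per developer-day, depending on how many working weeks you assume.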

These are some pretty big numbers. But only when you consider them without the context of the total you’re spending on development, and more importantly, the return your organisation gets from that total investment.

I often see organisations – of all sizes and shapes – brought to their knees by legacy products and systems, and their inability to change them, and therefore to change the way they do business.

Think not about the 25%. Think instead about what you’re getting from the other 75%.

I’m A Slacker, And Proud Of It

That probably sounds like an unwise thing to put on your CV, but it’s nevertheless true. I deliberately leave slack in my schedule. I aim not to be busy. And that’s why I get more done.

As counterintuitive as it sounds, that’s the truth. The less I fill my diary, the more I’m able to achieve.

Here’s why.

Flash back to the 1990s, and picture a young and relatively inexperienced lead software developer. Thanks to years of social conditioning from family, from school, from industry, and from the media, I completely drank the Hustle Kool-Aid.

Get up early. Work, work, work. Meetings, meetings, meetings. Hustle, hustle, hustle. That’s how you get things done.

I filled my long work days completely, and then went home and read and practised and learned and answered emails and planned for the next work-packed day.

A friend and mentor recognised the signs. He recommended I read a book called Slack: Getting Past Burnout, Busywork & The Myth of Total Efficiency by Tom DeMarco. It changed my life.

Around this time, ‘Extreme Programming’ was beginning to buzz on the message boards and around the halls of developer conferences. These two revelations came at roughly the same time. It’s not about how fast you can go in one direction – following a plan. It’s about how easily you can change direction, change the plan. And for change to be easy, you need adaptive capacity – otherwise known as slack.

Here was me as a leader:

“Jason, we need to talk about this urgent thing”

“I can fit you in a week on Thursday”

“Jason, should we look into these things called ‘web services’?”

“No time, sorry”

“Jason, your trousers are on fire”

“Send me a memo and I’ll schedule some time to look into that”

At an organisational level, lack of adaptive capacity can be fatal. The more streamlined and efficient organisations are at what they currently do, the less able they are to learn to do something else. Try turning a car at its absolute top speed.

At a personal level, the drive to be ever more efficient – to go ever faster – also has serious consequences. Aside from the very real risk of burning out – which ends careers and sometimes lives – it’s actually the dumbest way of getting things done. There are very few jobs left where everything’s known at the start, where nothing changes, and where just sticking to a plan will guarantee a successful outcome. Most outcomes are more complex than that. We need to learn our way towards them, adjusting as we go. And changing direction requires spare capacity: time to review, time to reflect, time to learn, time to adjust.

On a more serious note, highly efficient systems tend to be very brittle. Think of our rail networks. The more we seek to make them more efficient, the bigger the impact on the network when something goes wrong. If we have a service that takes 30 minutes to get from, say, London Waterloo to Surbiton, and we run it every hour, if there’s a delay, there’s 30 minutes of slack to recover in. The next train doesn’t have to be late. If we run it every 30 minutes – at maximum “efficiency” – there’s no wiggle room. The next train will be late, and the one after that, etc.

My days were kind of like that; if my 9am meeting overran, then I’d be late for my 9:20, and late for my 10am, and so on.
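The knock-on effect can be shown with a tiny simulation. This Java sketch assumes a deliberately simple model, in which each service (or meeting) can’t start until the previous one has finished; the names and the 20-minute initial delay are made up for the example.

```java
import java.util.Arrays;

// Simplified model: one service at a time, each waiting for the previous one to finish.
class SlackSimulation {
    // Minutes late for each of `runs` consecutive services after an initial delay.
    static int[] lateness(int intervalMinutes, int durationMinutes, int initialDelay, int runs) {
        int slack = intervalMinutes - durationMinutes; // spare time between services
        int[] late = new int[runs];
        late[0] = initialDelay;
        for (int i = 1; i < runs; i++) {
            // Each gap absorbs up to `slack` minutes of the delay carried over.
            late[i] = Math.max(0, late[i - 1] - slack);
        }
        return late;
    }

    public static void main(String[] args) {
        // 30-minute service, first run delayed by 20 minutes.
        System.out.println("Hourly:           " + Arrays.toString(lateness(60, 30, 20, 4)));
        System.out.println("Every 30 minutes: " + Arrays.toString(lateness(30, 30, 20, 4)));
    }
}
```

With an hourly service the delay is absorbed before the next departure. With no slack at all, every later service inherits the full 20 minutes.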

When we stretch ourselves and our systems to breaking point – which is what ‘100% efficiency’ really means – we end up being rigid (hard to change) and brittle (easy to break).

We’re seeing that now in many countries’ handling of the pandemic. After a decade of ideological austerity stripping away more and more resources from public services in the UK, forcing them to become ever more ‘efficient’, the appearance of the unexpected – though we really should have been expecting it at some point – has now broken many of those services, and millions of lives.

Since the late 90s, I’ve deliberately kept my diary loose. For example, I try very hard to avoid running two training courses in the same week. When someone else was managing my diary and my travel arrangements, they’d have me finishing work in one city and jumping on a late train or flight to the next city for another appointment the next morning. This went wrong very, very often. And there was no time to adjust at all. If you’ve ever tried to find a replacement laptop at 7am in a strange city, you’ll know what I’m talking about.

So I highly recommend reading Tom’s book, especially if you’re recognising the symptoms. And then you too can become a more productive slacker.

Beware False Trade-offs

Over the last 10 months we’ve seen how different governments have handled the COVID-19 pandemic in their own countries, and how nations have been impacted very differently as a result.

While countries like Italy, the United Kingdom and Belgium have more than 100 deaths per 100,000 of the population, places where governments acted much faster and more decisively, like New Zealand, have a far lower mortality rate (in the case of NZ, 0.5 deaths per 100,000).

Our government made the argument that they had to balance saving lives with saving the economy. But this, it transpires, is a false dichotomy. In 2020, the UK saw GDP shrink by an estimated 11.3%. New Zealand’s economy actually grew slightly by 0.4%.

For sure, during their very stringent measures to tackle the virus, their economy shrank like everyone else’s. But having very effectively made their country COVID-free, it bounced back in a remarkable V-shaped recovery. Life in countries that took the difficult decisions earlier has mostly returned to normal. Shops, bars, restaurants, theatres and sports stadiums are open, and NZ is very much open for business.

The depressing fact is that countries like the UK made a logical error in trying to keep the economy going when they should have been cracking down on the spread of the virus. In March, cases were doubling roughly twice a week, and every week’s delay in acting cost four times as many lives. Delaying for 2 weeks in March meant that infection cases soared to a level that made the subsequent lockdown much, much longer. Hence there was a far greater impact on the economy.

Eventually, by early July, cases in the UK had almost disappeared. At which point, instead of doubling down on the measures to ensure a COVID-free UK, the government made the same mistake all over again. They opened everything up again because they mistakenly calculated that they had to get the economy moving as soon as possible.

Cases started to rise again – albeit at a slower rate this time, as most people were still taking steps to reduce risks of infection – and around we went a second time.

The next totally predictable – and totally predicted – lockdown again came weeks too late in November.

And again, as soon as they saw that cases were coming down, they reopened the economy.

We’re now in our third lockdown, and this one looks set to last until late Spring at the earliest. This time, we have vaccines on our side, and life will hopefully get back to relative normal in the summer, but the damage has been done. And, yet again, the damage is far larger than it needed to be.

50,000 families have lost their homes since March 2020. Thousands of businesses have folded. Theatres may never reopen, and city centres will probably never recover as home-working becomes the New Normal.

By trying to trade off saving lives against the economy, countries like the UK have ended up with the worst of both worlds: one of the highest mortality rates in Europe, and one of the worst recessions.

You see, it’s not saving lives or saving the economy. It’s saving lives and saving the economy. The same steps that would have saved more lives would have made the lockdowns shorter, and therefore brought economic recovery faster.

Why am I telling you all this? Well, we have our own false dichotomies in software. The most famous one being the perceived trade-off between quality and time or cost.

An unwillingness to invest in, say, more testing sooner in the mistaken belief that it will save time leads many teams into deep water. Over three decades, I’ve seen countless times how this leads to software that’s both buggier and costs more to deliver and to maintain – the worst of both worlds.

The steps we can take to improve the quality of our software turn out to be the same steps that help us deliver it sooner, and maintain it for longer for less money. Time “wasted” writing developer tests, for example, is actually an order of magnitude more time saved downstream (where “downstream” could just as easily mean “later today” as “after release”).

But the urge to cut corners and make trade-offs is strong, especially in highly politicised environments where leaders are rarely thinking past the next headline (or in our case, the next meeting with the boss). It’s a product of timid leadership, and one-dimensional, short-term reasoning.

When we go by the evidence, we see that many trade-offs are nothing of the sort.

Big Test Set-Ups Don’t Necessarily Point to Design Problems

I was discussing what our test code can tell us about the design of our solutions this morning with a friend. It’s an interesting topic. The received wisdom is that big test set-ups mean that the class or module being tested has too many dependencies and is therefore almost certainly doing too much.

This is often the case, but not always. Let me illustrate with an example. Here’s an integration test for my Guitar Shack solution:

package com.guitarshack.integrationtests;

import com.guitarshack.*;
import com.guitarshack.net.RESTClient;
import com.guitarshack.net.RequestBuilder;
import com.guitarshack.net.Web;
import com.guitarshack.product.ProductData;
import com.guitarshack.sales.SalesData;
import com.guitarshack.sales.ThirtyDayAverageSalesRate;
import org.junit.Test;

import java.util.Calendar;

import static org.mockito.Matchers.any;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.verify;

/*
    It's a good idea to have at least one test that wires together most or all
    of the implementations of our interfaces to check that we haven't missed anything
*/
public class StockMonitorIntegrationTest {

    @Test
    public void alertShouldBeTriggered(){
        Alert alert = mock(Alert.class);
        StockMonitor monitor = new StockMonitor(
                alert,
                new ProductData(
                        new RESTClient(
                                new Web(),
                                new RequestBuilder())),
                new LeadTimeReorderLevel(
                        new ThirtyDayAverageSalesRate(
                                new SalesData(
                                        new RESTClient(
                                                new Web(),
                                                new RequestBuilder()
                                        ),
                                        () -> {
                                            Calendar calendar = Calendar.getInstance();
                                            calendar.set(2019, Calendar.AUGUST, 1);
                                            return calendar.getTime();
                                        }
                                )
                        )
                )
        );
        monitor.productSold(811, 40);
        verify(alert).send(any());
    }
}

The set-up for this test is pretty big. Does that mean my StockMonitor class has too many dependencies? Let’s take a look.

public class StockMonitor {
    private final Alert alert;
    private final Warehouse warehouse;
    private final ReorderLevel reorderLevel;

    public StockMonitor(Alert alert, Warehouse warehouse, ReorderLevel reorderLevel) {
        this.alert = alert;
        this.warehouse = warehouse;
        this.reorderLevel = reorderLevel;
    }

    public void productSold(int productId, int quantity) {
        Product product = warehouse.fetchProduct(productId);
        if(needsReordering(product, quantity))
            alert.send(product);
    }

    // Reorder when the stock left after this sale is at or below the reorder level
    private Boolean needsReordering(Product product, int quantitySold) {
        return product.getStock() - quantitySold <= reorderLevel.calculate(product);
    }
}

That actually looks fine to me. StockMonitor essentially does one job, and collaborates with three other classes in my solution. The rest of the design is hidden behind those interfaces.

In fact, the design is like that all the way through. Each class only does one job. Each class hides its internal workings behind small, client-specific interfaces. Each dependency is swappable by dependency injection. This code is highly modular.

When we look at the unit test for StockMonitor, we see a much smaller set-up.

public class StockMonitorTest {

    @Test
    public void alertSentWhenProductNeedsReordering() {
        Alert alert = mock(Alert.class);
        ReorderLevel reorderLevel = product1 -> 10;
        Product product = new Product(811, 11, 14);
        Warehouse warehouse = productId -> product;
        StockMonitor monitor = new StockMonitor(alert, warehouse, reorderLevel);
        monitor.productSold(811, 1);
        verify(alert).send(product);
    }
}

The nesting in the set-up for the integration test is a bit of a clue here.

StockMonitor monitor = new StockMonitor(
        alert,
        new ProductData(
                new RESTClient(
                        new Web(),
                        new RequestBuilder())),
        new LeadTimeReorderLevel(
                new ThirtyDayAverageSalesRate(
                        new SalesData(
                                new RESTClient(
                                        new Web(),
                                        new RequestBuilder()
                                ),
                                () -> {
                                    Calendar calendar = Calendar.getInstance();
                                    calendar.set(2019, Calendar.AUGUST, 1);
                                    return calendar.getTime();
                                }
                        )
                )
        )
);

This style of object construction is what I call “Russian dolls”. The objects at the bottom of the call stack are injected into the objects one level up, which are injected into objects another level up, and so on. Each object only sees its direct collaborators, and the lower layers are hidden behind their interfaces.

This is a natural consequence of the way I test-drove my solution: from the outside in, solving one problem at a time and using stubs and mocks as placeholders for sub-solutions.

So the big set-up in my integration test is not a sign of a class that’s doing too much, or of a lack of separation of concerns, because it’s a “Russian dolls” set-up. If it were a “flat set-up”, where every object is passed in as a direct parameter of StockMonitor’s constructor, then that would surely be a sign of StockMonitor doing too much.
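To make the contrast concrete, here is a rough sketch of the two shapes side by side. Everything in it is hypothetical: NestedStockMonitor, FlatStockMonitor and the empty stand-in classes only mimic the shape of the Guitar Shack types, and the flat constructor is just one possible reading of what a “flat set-up” might look like. It counts constructor parameters as a crude proxy for direct dependencies.

```java
import java.util.Date;
import java.util.function.Supplier;

// Hypothetical sketch: these stand-in types only mimic the shape of the
// Guitar Shack classes. They are not the real code.
public class FlatSetUpSketch {
    interface Alert { void send(String message); }
    static class Web {}
    static class RequestBuilder {}
    static class RESTClient {
        RESTClient(Web web, RequestBuilder requestBuilder) {}
    }
    static class ProductData {
        ProductData(RESTClient client) {}
    }
    static class SalesData {
        SalesData(RESTClient client, Supplier<Date> today) {}
    }

    // "Russian dolls": the monitor only knows its direct collaborators,
    // and each of those hides the layers beneath it.
    static class NestedStockMonitor {
        NestedStockMonitor(Alert alert, ProductData productData, SalesData salesData) {}
    }

    // "Flat": the lower layers are handed straight to the monitor, so the
    // monitor itself has to know about networking and dates.
    static class FlatStockMonitor {
        FlatStockMonitor(Alert alert, Web web, RequestBuilder requestBuilder,
                         Supplier<Date> today) {}
    }

    // Counts the parameters of a class's only declared constructor.
    static int directDependencies(Class<?> type) {
        return type.getDeclaredConstructors()[0].getParameterCount();
    }

    public static void main(String[] args) {
        Alert alert = message -> {};
        Supplier<Date> today = Date::new;

        // Big, nested set-up, but the monitor sees only three collaborators.
        new NestedStockMonitor(
                alert,
                new ProductData(new RESTClient(new Web(), new RequestBuilder())),
                new SalesData(new RESTClient(new Web(), new RequestBuilder()), today));

        // Shorter-looking set-up, but the monitor now sees the low-level details.
        new FlatStockMonitor(alert, new Web(), new RequestBuilder(), today);

        System.out.println("nested: " + directDependencies(NestedStockMonitor.class));
        System.out.println("flat:   " + directDependencies(FlatStockMonitor.class));
    }
}
```

The parameter count matters less than what the parameters are: in the flat version, the monitor is coupled to the web client and the clock, which are details the nested version keeps hidden behind its collaborators.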

So, big set-up != lack of modularity in certain cases. What about the other way around? Does a small set-up always mean no problems in the solution design?

Before Christmas I refuctored my Guitar Shack solution to create some practice “legacy code” for students to stretch their refactoring skills on.

public class StockMonitor {
    private final Alert alert;

    public StockMonitor(Alert alert) {
        this.alert = alert;
    }

    public void productSold(int productId, int quantity) {
        String baseURL = "https://6hr1390c1j.execute-api.us-east-2.amazonaws.com/default/product";
        Map<String, Object> params = new HashMap<>() {{
            put("id", productId);
        }};
        String paramString = "?";
        for (String key : params.keySet()) {
            paramString += key + "=" + params.get(key).toString() + "&";
        }
        HttpRequest request = HttpRequest
                .newBuilder(URI.create(baseURL + paramString))
                .build();
        String result = "";
        HttpClient httpClient = HttpClient.newHttpClient();
        HttpResponse<String> response = null;
        try {
            response = httpClient.send(request, HttpResponse.BodyHandlers.ofString());
            result = response.body();
        } catch (IOException | InterruptedException e) {
            e.printStackTrace();
        }
        Product product = new Gson().fromJson(result, Product.class);
        Calendar calendar = Calendar.getInstance();
        calendar.setTime(Calendar.getInstance().getTime());
        Date endDate = calendar.getTime();
        calendar.add(Calendar.DATE, -30);
        Date startDate = calendar.getTime();
        DateFormat format = new SimpleDateFormat("M/d/yyyy");
        Map<String, Object> params1 = new HashMap<>(){{
            put("productId", product.getId());
            put("startDate", format.format(startDate));
            put("endDate", format.format(endDate));
            put("action", "total");
        }};
        String paramString1 = "?";
        for (String key : params1.keySet()) {
            paramString1 += key + "=" + params1.get(key).toString() + "&";
        }
        HttpRequest request1 = HttpRequest
                .newBuilder(URI.create("https://gjtvhjg8e9.execute-api.us-east-2.amazonaws.com/default/sales" + paramString1))
                .build();
        String result1 = "";
        HttpClient httpClient1 = HttpClient.newHttpClient();
        HttpResponse<String> response1 = null;
        try {
            response1 = httpClient1.send(request1, HttpResponse.BodyHandlers.ofString());
            result1 = response1.body();
        } catch (IOException | InterruptedException e) {
            e.printStackTrace();
        }
        SalesTotal total = new Gson().fromJson(result1, SalesTotal.class);
        if(product.getStock() - quantity <= (int) ((double) (total.getTotal() / 30) * product.getLeadTime()))
            alert.send(product);
    }
}

Yikes!

I think it’s beyond any reasonable doubt that this class does too much. There’s almost no separation of concerns in this design.

Now, I didn’t write any unit tests for this (because “legacy code”), but I do have a command line program I can use the StockMonitor with for manual or shell script testing. Take a look at the set-up.

public class Program {
    private static StockMonitor monitor = new StockMonitor(product -> {
        // We are faking this for now
        System.out.println(
                "You need to reorder product " + product.getId() +
                        ". Only " + product.getStock() + " remaining in stock");
    });

    public static void main(String[] args) {
        int productId = Integer.parseInt(args[0]);
        int quantity = Integer.parseInt(args[1]);
        monitor.productSold(productId, quantity);
    }
}

It’s pretty small. And that’s because StockMonitor’s dependencies are nearly all hard-wired inside it. Ironically, lack of separation of concerns in this case means a simple interface and a tiny set-up.

So, big set-ups don’t always point to a lack of modularity, and small set-ups don’t always mean that we have modularity in our design.

Of course, what the big set-up in our integration test does mean is that this test could fail for many reasons, in many layers of our call stack. So if all our tests have big set-ups, that in itself could spell trouble.

Explore the Guitar Shack source code

What Can We Learn From The Movie Industry About Testing Feedback Loops?

In any complex creative endeavour – and, yes, software development is a creative endeavour – feedback is essential to getting it right (or, at least, less wrong).

The best development approaches tend to be built around feedback loops, and the last few decades of innovation in development practices and processes have largely focused on shrinking those feedback loops so we can learn our way to Better faster.

When we test our software, that’s a feedback loop, for example. Although far less common these days, there are still teams out there doing it manually. Their testing feedback loops can last days or even weeks. Many teams, though, write fast-running automated tests, and can test the bulk of their code in minutes or even seconds.

What difference does it make if your tests take days instead of seconds?

To illustrate, I’m going to draw a parallel with movie production. Up until the late 1960s, feedback loops in movie making were at best daily. Footage shot on film during the day was processed by a lab and then watched by directors, producers, editors and so on at the end of the day. Hence the movie industry term “dailies”. If a shot didn’t come out right – maybe the performance didn’t fit the context of a previous scene (the classic “boom microphone in shot” or “character just ran 6 miles but is mysteriously not out of breath” spring to mind) – chances are the production team wouldn’t know until they saw the footage later.

That could well mean going back and reshooting some scenes. That means calling back the actors and the crew, and potentially remounting the whole thing if the sets have already been pulled down. Expensive. Sometimes prohibitively expensive, which is why lower-budget productions had little choice but to keep those shots in their theatrical releases.

In the 1960s, comedy directors like Jerry Lewis and Blake Edwards pioneered the use of Video assist. These were systems that enabled the same footage to be recorded simultaneously on film and on videotape, so actors and directors could watch takes back as soon as they’d been captured, and correct mistakes there and then when the actors, crew, sets and so on were all still there. Way, way cheaper than remounting.

The speed of testing feedback in software development has a similar impact. If I make a change that breaks the code, and my code is tested overnight, say, then I probably won’t know it’s broken until the next day (or the next week, or the next month, or the next year when a user reports the bug).

But I’ve already moved on. The sets have been dismantled, so to speak. To fix a bug long after the fact requires the equivalent of remounting a shoot in movies. Time has to be scheduled, the developer has to wrap their head around that code again, and the bug fix has to go through the whole testing and release process again. Far more expensive. Often orders of magnitude more expensive. Sometimes prohibitively expensive, which is why many teams ship software they know has bugs, but they just don’t have budget to fix them (or, at least, they believe they’re not worth fixing.)

If my code is tested in seconds, that’s like having Video assist. I can make one change and run the tests. If I broke the code, I’ll know there and then, and can easily fix it while I’m still in the zone.

Just as Video assist helps directors make better movies for less money, fast-running automated tests can help us deliver more reliable software with less effort. This is a measurable effect (indeed, it has been measured), so we know it works.

Stuck In Service-Oriented Hell? You Need Contract Tests

As our system architectures get increasingly distributed, many teams experience the pain of writing code that consumes services through APIs that are changing.

Typically, we don’t know that a non-backwards-compatible change to an external dependency has happened until our own code suddenly stops working. I see many organisations spending a lot of time fixing those problems. I shudder to think how much time and money, as an industry, we’re wasting on it.

Ideally, developers of APIs wouldn’t make changes that break contracts with client code. But this is not an ideal world.

What would be very useful is an early warning system that flags up the fact that a change we’ve made is going to break client code before we release it. As a general rule with bugs, the sooner we know, the cheaper they are to fix.

Contract tests are a good tool for getting that early warning. A contract test is a kind of integration test that focuses specifically on the interaction between our code and an external dependency. When they fail, they pinpoint the source of the problem: namely that something has likely changed at the other end.

There are different ways of writing contract tests, but one of my favourites is to use Abstract Tests. Take this example from my Guitar Shack code:

package com.guitarshack;

import com.guitarshack.net.Network;
import com.guitarshack.net.RESTClient;
import com.guitarshack.net.RequestBuilder;
import com.guitarshack.sales.SalesData;
import org.junit.Test;

import java.util.Calendar;

import static org.junit.Assert.assertTrue;

/*
    This Abstract Test allows us to create two versions of the set-up, one with
    stubbed JSON and one that actually connects to the web service, effectively pinpointing
    whether an error has occurred because of a change to our code or a change to the external dependency
*/
public abstract class SalesDataTestBase {

    @Test
    public void fetchesSalesData(){
        SalesData salesData = new SalesData(new RESTClient(getNetwork(), new RequestBuilder()), () -> {
            Calendar calendar = Calendar.getInstance();
            calendar.set(2019, Calendar.JANUARY, 31);
            return calendar.getTime();
        });
        int total = salesData.getTotal(811);
        assertTrue(total > 0);
    }

    protected abstract Network getNetwork();
}

The getNetwork() method is left abstract so it can be overridden in subclasses.

One implementation uses a stub implementation of the Network interface that returns hardcoded JSON, so I can unit test most of my SalesData class.

package com.guitarshack.unittests;

import com.guitarshack.net.Network;
import com.guitarshack.SalesDataTestBase;

public class SalesDataUnitTest extends SalesDataTestBase {

    @Override
    protected Network getNetwork() {
        return request -> "{\"productID\":811,\"startDate\":\"7/17/2019\",\"endDate\":\"7/27/2019\",\"total\":31}";
    }
}

Another implementation returns the real implementation of Network, called Web, and connects to a real web service hosted as an AWS Lambda.

package com.guitarshack.contracttests;

import com.guitarshack.*;
import com.guitarshack.net.Network;
import com.guitarshack.net.Web;

/*
    If this test fails when the SalesDataUnitTest is still passing, this indicates a change
    in the external API
*/
public class SalesDataContractTest extends SalesDataTestBase {

    @Override
    protected Network getNetwork() {
        return new Web();
    }
}

If the contract test suddenly starts failing while the unit test is still passing, that’s a big hint that the problem is at the other end.

To illustrate, I changed the output field ‘total’ to ‘salesTotal’ in the JSON being outputted from the web service. See what happens when I run my tests.

The contract test fails (as well as a larger integration test, that wouldn’t pinpoint the source of the problem as effectively), while the unit test version is still passing.
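As a rough illustration of why the two versions diverge, the sketch below mimics what happens to the parsed total when the field is renamed. It is not the real SalesData code: readTotal is a made-up helper that uses a regex to stay self-contained, and it assumes that a missing field is read as zero (which is how a JSON mapper would typically leave an unset int field).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: a tiny stand-in for the JSON parsing behind SalesData.
// A regex replaces the JSON library so the example has no dependencies.
public class RenamedFieldSketch {

    // Reads the integer value of the "total" field, or 0 if the field is absent.
    // Assumption: a missing field defaults to zero rather than throwing.
    static int readTotal(String json) {
        Matcher matcher = Pattern.compile("\"total\"\\s*:\\s*(\\d+)").matcher(json);
        return matcher.find() ? Integer.parseInt(matcher.group(1)) : 0;
    }

    public static void main(String[] args) {
        // What the unit test's stub keeps returning
        String stubbedJson = "{\"productID\":811,\"total\":31}";
        // What the web service returns after the field is renamed
        String changedJson = "{\"productID\":811,\"salesTotal\":31}";

        // Mirrors the assertion total > 0 in the abstract test
        System.out.println("stubbed JSON passes: " + (readTotal(stubbedJson) > 0));
        System.out.println("changed JSON passes: " + (readTotal(changedJson) > 0));
    }
}
```

The stub never changes, so the unit test keeps passing. Only the version that talks to the real service sees the renamed field.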

When I change ‘salesTotal’ back to ‘total’, all the tests pass again.

This is very handy for me as a client code developer, writing code that consumes the sales data API. But it would be even more useful for the developer of that API to be able to run my contract tests, perhaps before a release, or as an overnight job, so they could get early warning that their changes have broken the contract.

For teams who are able to access each other’s code (e.g., on GitHub), that’s quite straightforward. I could rig up my Maven project to enable developers to build and run just those tests, for example. Notice that my unit and contract tests are in different packages to make that easy.

For teams who can’t access each other’s repos, it may take a little more ingenuity. But we’re probably used to seeing our code built and tested on other machines (e.g., on cloud CI servers) and getting the results back over the web. It’s not rocket science to offer Contract Testing as a Service. You could then give the API developers exclusive – possibly secured – access to your contract test builds over HTTP.

I’ve seen contract testing – done well – save organisations a lot of blood, sweat and tears. At the very least, it can defend API developers from breaking the First Law of Software Development:

Thou shalt not break shit that was working

Jason Gorman