Beware False Trade-offs

Over the last 10 months we’ve seen how different governments have handled the COVID-19 pandemic in their own countries, and how nations have been impacted very differently as a result.

While countries like Italy, the United Kingdom and Belgium have more than 100 deaths per 100,000 of the population, places where governments acted much faster and more decisively, like New Zealand, have a far lower mortality rate (in the case of NZ, 0.5 deaths per 100,000).

Our government made the argument that they had to balance saving lives with saving the economy. But this, it transpires, is a false dichotomy. In 2020, the UK saw GDP shrink by an estimated 11.3%. New Zealand’s economy actually grew slightly by 0.4%.

For sure, during their very stringent measures to tackle the virus, their economy shrank like everyone else’s. But having very effectively made their country COVID-free, it bounced back in a remarkable V-shaped recovery. Life in countries that took the difficult decisions earlier has mostly returned to normal. Shops, bars, restaurants, theatres and sports stadiums are open, and NZ is very much open for business.

The depressing fact is that countries like the UK made a logical error in trying to keep the economy going when they should have been cracking down on the spread of the virus. In March, cases were doubling roughly twice a week, and every week’s delay in acting cost four times as many lives. Delaying for 2 weeks in March meant that infection cases soared to a level that made the subsequent lockdown much, much longer. Hence there was a far greater impact on the economy.

Eventually, by early July, cases in the UK had almost disappeared. At which point, instead of doubling down on the measures to ensure a COVID-free UK, the government made the same mistake all over again. They opened everything up again because they mistakenly calculated that they had to get the economy moving as soon as possible.

Cases started to rise again – albeit at a slower rate this time, as most people were still taking steps to reduce risks of infection – and around we went a second time.

The next totally predictable – and totally predicted – lockdown again came weeks too late in November.

And again, as soon as they saw that cases were coming down, they reopened the economy.

We’re now in our third lockdown, and this one looks set to last until late Spring at the earliest. This time, we have vaccines on our side, and life will hopefully get back to relative normal in the summer, but the damage has been done. And, yet again, the damage is far larger than it needed to be.

50,000 families have lost their homes since March 2020. Thousands of businesses have folded. Theatres may never reopen, and city centres will probably never recover as home-working becomes the New Normal.

By trying to trade off saving lives against the economy, countries like the UK have ended up with the worst of both worlds: one of the highest mortality rates in Europe, and one of the worst recessions.

You see, it’s not saving lives or saving the economy. It’s saving lives and saving the economy. The same steps that would have saved more lives would have made the lockdowns shorter, and therefore brought economic recovery faster.

Why am I telling you all this? Well, we have our own false dichotomies in software. The most famous one being the perceived trade-off between quality and time or cost.

An unwillingness to invest in, say, more testing sooner in the mistaken belief that it will save time leads many teams into deep water. Over three decades, I’ve seen countless times how this leads to software that’s both buggier and more expensive to deliver and to maintain – the worst of both worlds.

The steps we can take to improve the quality of our software turn out to be the same steps that help us deliver it sooner, and maintain it for longer for less money. Time “wasted” writing developer tests, for example, is actually an order of magnitude more time saved downstream (where “downstream” could just as easily mean “later today” as “after release”).

But the urge to cut corners and make trade-offs is strong, especially in highly politicised environments where leaders are rarely thinking past the next headline (or in our case, the next meeting with the boss). It’s a product of timid leadership, and one-dimensional, short-term reasoning.

When we go by the evidence, we see that many trade-offs are nothing of the sort.

What Can We Learn From The Movie Industry About Testing Feedback Loops?

In any complex creative endeavour – and, yes, software development is a creative endeavour – feedback is essential to getting it right (or, at least, less wrong).

The best development approaches tend to be built around feedback loops, and the last few decades of innovation in development practices and processes have largely focused on shrinking those feedback loops so we can learn our way to Better faster.

When we test our software, that’s a feedback loop, for example. Although far less common these days, there are still teams out there doing it manually. Their testing feedback loops can last days or even weeks. Many teams, though, write fast-running automated tests, and can test the bulk of their code in minutes or even seconds.

What difference does it make if your tests take days instead of seconds?

To illustrate, I’m going to draw a parallel with movie production. Up until the late 1960s, feedback loops in movie making were at best daily. Footage shot on film during the day was processed by a lab and then watched by directors, producers, editors and so on at the end of the day. Hence the movie industry term “dailies”. If a shot didn’t come out right – maybe the performance didn’t fit into the context of that scene with a previous scene (the classic “boom microphone in shot” or “character just ran 6 miles but is mysteriously not out of breath” spring to mind) – chances are the production team wouldn’t know until they saw the footage later.

That could well mean going back and reshooting some scenes. That means calling back the actors and the crew, and potentially remounting the whole thing if the sets have already been pulled down. Expensive. Sometimes prohibitively expensive, which is why lower-budget productions had little choice but to keep those shots in their theatrical releases.

In the 1960s, comedy directors like Jerry Lewis and Blake Edwards pioneered the use of Video assist. These were systems that enabled the same footage to be recorded simultaneously on film and on videotape, so actors and directors could watch takes back as soon as they’d been captured, and correct mistakes there and then when the actors, crew, sets and so on were all still there. Way, way cheaper than remounting.

The speed of testing feedback in software development has a similar impact. If I make a change that breaks the code, and my code is tested overnight, say, then I probably won’t know it’s broken until the next day (or the next week, or the next month, or the next year when a user reports the bug).

But I’ve already moved on. The sets have been dismantled, so to speak. To fix a bug long after the fact requires the equivalent of remounting a shoot in movies. Time has to be scheduled, the developer has to wrap their head around that code again, and the bug fix has to go through the whole testing and release process again. Far more expensive. Often orders of magnitude more expensive. Sometimes prohibitively expensive, which is why many teams ship software they know has bugs, but they just don’t have budget to fix them (or, at least, they believe they’re not worth fixing).

If my code is tested in seconds, that’s like having Video assist. I can make one change and run the tests. If I broke the code, I’ll know there and then, and can easily fix it while I’m still in the zone.
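To make that concrete, here is a minimal sketch in Python of a feedback loop measured in seconds. The pricing function, the two tests and the tiny runner are all hypothetical, invented purely for illustration; in practice a test framework such as pytest would do the running.

```python
import time


def total_pence(unit_price_pence, quantity, discount_percent=0):
    """Hypothetical production code: order total in pence after a discount."""
    return unit_price_pence * quantity * (100 - discount_percent) // 100


def test_total_without_discount():
    assert total_pence(250, 4) == 1000


def test_total_with_ten_percent_discount():
    assert total_pence(250, 4, discount_percent=10) == 900


def run_tests(tests):
    """Run each test function; return the names of failures and elapsed seconds."""
    started = time.perf_counter()
    failures = []
    for test in tests:
        try:
            test()
        except AssertionError:
            failures.append(test.__name__)
    return failures, time.perf_counter() - started


# Re-run after every small change: a break shows up while the change is fresh.
failures, seconds = run_tests([test_total_without_discount,
                               test_total_with_ten_percent_discount])
print(f"{len(failures)} failed in {seconds:.4f}s")
```

If I now break `total_pence`, the very next run names the failing test, and I'm still standing on the set.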

Just as Video assist helps directors make better movies for less money, fast-running automated tests can help us deliver more reliable software with less effort. This is a measurable effect (indeed, it has been measured), so we know it works.

Don’t Succumb To Illusions Of Productivity

One thing I come across very often is development teams who have adopted processes or practices that they believe are helping them go faster, but that are probably making no difference, or even slowing them down.

The illusion of productivity can be very seductive. When I bash out code without writing tests, or without refactoring, it really feels like I’m getting sh*t done. But when I measure my progress more objectively, it turns out I’m not.

That could be because typing code faster – without all those pesky interruptions – feels like delivering working software faster. But it usually takes longer to get something working when we take less care.

We seem hardwired not to notice how much time we spend fixing stuff later that didn’t need to be broken. We seem hardwired not to notice the team getting bigger and bigger as the bug count and the code smells and the technical debt pile up. We seem hardwired not to notice the merge hell we seem to end up in every week as developers try to get their changes into the trunk.

We just feel like…

Getting sh*t done

Not writing automated tests is one classic example. I mean, of course unit tests slow us down! It’s, like, twice as much code! The reality, though, is that without fast-running regression tests, we usually end up spending most of our time fixing bugs when we could be adding value to the product. The downstream costs typically outweigh the up-front investment in unit tests. Skipping tests is almost always a false economy, even on relatively short projects. I’ve measured myself with and without unit tests on ~1 hour exercises, and I’m slightly faster with them. Typing is not the bottleneck.

Another example is when teams mistakenly believe that working on separate branches of the code will reduce bottlenecks in their delivery pipelines. Again, it feels like we’re getting more done as we hack away in our own isolated sandboxes. But this, too, is an illusion. It doesn’t matter how many lanes the motorway has if every vehicle has to drive on to the same ferry at the end of it. No matter how many parallel dev branches you have, there’s only one branch deployments can be made from, and all those parallel changes have to somehow make it into that branch eventually. And the less often developers merge, the more changes in each merge. And the more changes in each merge, the more conflicts. And, hey presto, merge hell.

Closely related is the desire of many developers to work without “interruptions”. It may feel like sh*t’s getting done when the developers go off into their cubicles, stick their noise-cancelling headphones on, and hunker down on a problem. But you’d be surprised just how much communication and coordination’s required to avoid some serious misunderstandings. I recall working on a team where we ended up with three different architectures and four customer tables in the database, because my colleagues felt that standing around a whiteboard drawing pictures – or, as they called it, “meetings” – was a waste of valuable Getting Sh*t Done time. With just a couple of weeks of corrective work, we were able to save ourselves 20 minutes around a whiteboard. Go us!

I guess my message is simple. In software development, productivity doesn’t look like this:

Don’t be fooled by that illusion.

Slow Tests Kill Businesses

I’m always surprised at how few organisations track some pretty fundamental stats about software development, because if they did then they might notice what’s been killing their business.

It’s a picture I’ve seen many, many times; a software product or system is created, and it goes live. But it has bugs. Many bugs. So, a bigger chunk of the available development time is used up fixing bugs for the second release. Which has even more bugs. Many, many bugs. So an even bigger chunk of the time is used to fix bugs for the third release.

It looks a little like this:

Over the lifetime of the software, the proportion of development time devoted to bug fixing increases until that’s pretty much all the developers are doing. There’s precious little time left for new features.

Naturally, if you can only spare 10% of available dev time for new features, you’re going to need 10 times as many developers. Right? This trend is almost always accompanied by rapid growth of the team.

So the 90% of dev time you’re spending on bug fixing is actually 90% of the time of a team that’s 10x as large – 900% of the cost of your first release, just fixing bugs.

So every new feature ends up in real terms costing 10x in the eighth release what it would have in the first. For most businesses, this rules out change – unless they’re super, super successful (i.e., lucky). It’s just too damned expensive.
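The arithmetic in the last few paragraphs can be checked with a few lines of Python. This is a rough sketch that assumes, as the text does, that the first release spent essentially all of its time on features and that the team grows just enough to keep feature output constant. The function name is made up for illustration.

```python
def bug_fixing_costs(feature_share):
    """Cost multipliers relative to the first release.

    feature_share: fraction of dev time still available for new features.
    Assumes the team grows just enough to keep feature output constant.
    """
    team_multiplier = 1 / feature_share
    bug_fix_cost = (1 - feature_share) * team_multiplier
    cost_per_feature = team_multiplier  # same feature output, bigger payroll
    return team_multiplier, bug_fix_cost, cost_per_feature


team, bug_fixing, per_feature = bug_fixing_costs(0.10)
print(f"team x{team:.0f}, bug fixing {bug_fixing:.0%} of release 1, "
      f"each feature x{per_feature:.0f}")
```

With 10% of time left for features, that works out to a team 10 times the size, bug fixing at 900% of the cost of the first release, and each feature costing 10 times as much.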

And when you can’t change your software and your systems, you can’t change the way you do business at scale. Your business model gets baked in – petrified, if you like. And all you can do is throw an ever-dwindling pot of money at development just to stand still, while you watch your competitors glide past you with innovations you’ll never be able to offer your customers.

What happens to a business like that? Well, they’re no longer in business. Customers defected in greater and greater numbers to competitor products, frustrated by the flakiness of the product and tired of being fobbed off with promises about upgrades and hotly requested features and fixes that never arrived.

Now, this effect is entirely predictable. We’ve known about it for many decades, and we’ve known the causal mechanism, too.

Source: IBM System Science Institute

The longer a bug goes undetected, the more it costs to fix, and that cost grows exponentially. In terms of process, the sooner we test new or changed code, the cheaper the fix is. This effect is so marked that teams actually find that if they speed up testing feedback loops – testing earlier and more often – they deliver working software faster.

This is very simply because they save more time downstream on bug fixes than they invest in earlier and more frequent testing.

The data used in the first two graphs was taken from a team that took more than 24 hours to build and test their code.

Here are the same stats from a team who could build and test their code in less than 2 minutes (I’ve converted from releases to quarters to roughly match the 12-24 week release cycles of the first team – this second team was actually releasing every week):

This team has nearly doubled in size over the two years, which might sound bad – but it’s more of a rosy picture than the first team, whose costs spiralled to more than 1000% of their first release, most of which was being spent fixing bugs and effectively going round and round in circles chasing their own tails while their customers defected in droves.

I’ve seen this effect repeated in business after business – of all shapes and sizes: software companies, banks, retail chains, law firms, broadcasters, you name it. I’ve watched $billion businesses – some more than a century old – brought down by their inability to change their software and their business-critical systems.

And every time I got down to the root cause, there they were – slow tests.

Every. Single. Time.

Where’s User Experience In Your Development Process?

I ran a little poll through the Codemanship Twitter account yesterday, and thought I’d share the result with you.

There are two things that strike me about the results. Firstly, it looks like teams who actively involve user experience experts throughout the design process are very much in the minority. To be honest, this comes as no great surprise. My own observations of development teams over years tend to see UXD folks getting involved early on – often before any developers are involved, or any customer tests have been discussed – in a kind of a Waterfall fashion. “We’re agile. But the user interface design must not change.”

To me, this is as nonsensical as those times when I’ve arrived on a project that has no use cases or customer tests, but somehow magically has a very fleshed-out database schema that we are not allowed to change.

Let’s be clear about this: the purpose of the user experience is to enable the user to achieve their goals. That is a discussion for everybody involved in the design process. It’s also something we’re unlikely to get right first time, so iterating the UXD multiple times with the benefit of end user feedback almost certainly will be necessary.

The most effective teams do not organise themselves into functional silos of requirements analysis, UXD, architecture, programming, security, data management, testing, release and operations & support and so on, throwing some kind of output (a use case, a wireframe, a UML diagram, source code, etc) over the wall to the next function.

The most effective teams organise themselves around achieving a goal. Whoever’s needed to deliver on that should be in the room – especially when those goals are being discussed and agreed.

I could have worded the question in my poll “User Experience Designers: when you explore user goals, how often are the developers involved?” I suspect the results would have been similar. Because it’s the same discussion.

On a movie production, you have people who write scripts, people who say the lines, people who create sets, people who design costumes, and so on. But, whatever their function, they are all telling the same story.

The realisation of working software requires multiple disciplines, and all of them should be serving the story. The best teams recognise this, and involve all of the disciplines early and throughout the process.

But, sadly, this still seems quite rare. I hear lip service being paid, but see little concrete evidence that it’s actually going on.

The second thing I noticed about this poll is that, despite several retweets, the response is actually pretty low compared to previous polls. This, I suspect, also tells a story. I know from both observation and from polls that teams who actively engage with their customers – let alone UXD professionals etc – in their BDD/ATDD process are a small minority (maybe about 20%). Most teams write the “customer tests” themselves, and mistake using a BDD tool like Cucumber for actually doing BDD.

But I also get a distinct sense, working with many dev teams, that UXD just isn’t on their radar. That is somebody else’s problem. This is a major, major miscalculation – every bit as much as believing that quality assurance is somebody else’s problem. Any line of code that doesn’t in some way change the user’s experience – and I use the term “user” in the wider sense that includes, for example, people supporting the software in production, who will have their own user experience – is a line of code that should be deleted. Who is it for? Whose story does it serve?

We are all involved in creating the user experience. Bad special effects can ruin a movie, you know.

We may not all be qualified in UXD, of course. And that’s why the experts need to be involved in the ongoing design process, because UX decisions are being taken throughout development. It only ends when the software ends (and even that process – decommissioning – is a user experience).

Likewise, every decision a UI designer takes will have technical implications, and they may not be the experts in that. Which is why the other disciplines need to be involved from the start. It’s very easy to write a throwaway line in your movie script like “Oh look, it’s Bill, and he’s brought 100,000 giant fighting robots with him”, but writing 100,000 giant fighting robots and making 100,000 giant fighting robots actually appear on the screen are two very different propositions.

So let’s move on from the days of developers being handed wire-frames and told to “code this up”, and from developers squeezing input validation error messages into random parts of web forms, and bring these – and all the other – disciplines together into what I would call a “development team”.

The Software Design Process

One thing that sadly rarely gets discussed these days is how we design software. That is, how we get from a concept to working code.

As a student (and teacher) of software design and architecture of many years, experiencing first-hand many different methodologies from rigorous to ad hoc, heavyweight to agile, I can see similarities between all effective approaches.

Whether you’re UML-ing or BDD-ing or Event Storming-ing your designs, when it works, the thought process is the same.

It starts with a goal.

This – more often than not – is a problem that our customer needs solving.

This, of course, is where most teams get the design thinking wrong. They don’t start with a goal – or if they do, most of the team aren’t involved at that point, and subsequently are not made aware of what the original goal or problem was. They’re just handed a list of features and told “build that”, with no real idea what it’s for.

But they should start with a goal.

In design workshops, I encourage teams to articulate the goal as a single, simple problem statement. e.g.,

It’s really hard to find good vegan takeaway in my area.

Jason Gorman, just now

Our goal is to make it easier to order vegan takeaway food. This, naturally, begs the question: how hard is it to order vegan takeaway today?

If our target customer area is Greater London, then at this point we need to hit the proverbial streets and collect data to help us answer that question. Perhaps we could pick some random locations – N, E, S and W London – and try to order vegan takeaway using existing solutions, like Google Maps, Deliveroo and even the Yellow Pages.

Our data set gives us some numbers. On average, it took 47 minutes to find a takeaway restaurant with decent vegan options. They were, on average, 5.2 miles from the random delivery address. The orders took a further 52 minutes to be delivered. In 19% of selected delivery addresses, we were unable to order vegan takeaway at all.

What I’ve just done there is apply a simple thought process known as Goal-Question-Metric.

We ask ourselves, which of these do we think we could improve on with a software solution? I’m not at all convinced software would make the restaurants cook the food faster. Nor will it make the traffic in London less of an obstacle, so delivery times are unlikely to speed up much.

But if our data suggested that to find a vegan menu from a restaurant that will deliver to our address we had to search a bunch of different sources – including telephone directories – then I think that’s something we could improve on. It hints strongly that lack of vegan options isn’t the problem, just the ease of finding them.

A single searchable list of all takeaway restaurants with decent vegan options in Greater London might speed up our search. Note that word: MIGHT.

I’ve long advocated that software specifications be called “theories”, not “solutions”. We believe that if we had a searchable list of all those restaurants we had to look in multiple directories for, that would make the search much quicker, and potentially reduce the incidences when no option was found.

Importantly, we can compare the before and the after – using the examples we pulled from the real world – to see if our solution actually does improve search times and hit rates.

Yes. Tests. We like tests.
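Goal-Question-Metric lends itself to exactly that kind of test. Here's a minimal Python sketch, assuming we keep the baseline figures quoted above and measure again after release. The class, the function and the "after" numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    question: str
    baseline: float  # measured before building anything

    def improved_by(self, measured):
        """Lower is better for every metric in this example."""
        return measured < self.baseline


# Goal: make it easier to order vegan takeaway.
# Baselines are the figures from the street survey described in the text.
METRICS = {
    "search_minutes": Metric("How long does it take to find a restaurant?", 47.0),
    "no_option_percent": Metric("How often can we not order at all?", 19.0),
}


def theory_holds(measured_after):
    """True only if every metric beats its baseline."""
    return all(metric.improved_by(measured_after[name])
               for name, metric in METRICS.items())


# Made-up "after" measurements, purely to show the comparison.
print(theory_holds({"search_minutes": 12.0, "no_option_percent": 8.0}))
print(theory_holds({"search_minutes": 55.0, "no_option_percent": 8.0}))
```

The first call reports that the theory held; the second reports that it didn't, because search got slower. Either way, we know.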

Think about it; we describe our modern development processes as iterative. But what does that really mean? To me – a physics graduate – it implies a goal-seeking process that applies the same steps over and over to an input, with the output of each cycle fed into the next, until it converges on a stable working solution.

Importantly, if there’s no goal, and/or no way of knowing if the goal’s been achieved, then the process doesn’t work. The wheels are turning, the engine’s revving, but we ain’t going anywhere in particular.
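In code, that kind of goal-seeking loop might look like the sketch below. It's only an analogy: the example borrows the Babylonian method for square roots as the repeated step, and the function name and parameters are invented here.

```python
def iterate(step, goal_met, start, max_cycles=100):
    """Feed each output into the next cycle until the goal test passes."""
    current, cycles = start, 0
    while not goal_met(current):
        if cycles == max_cycles:
            raise RuntimeError("no convergence: is the goal testable?")
        current = step(current)
        cycles += 1
    return current, cycles


# Goal: a number whose square is 2, to within a small tolerance.
root, cycles = iterate(
    step=lambda x: (x + 2 / x) / 2,            # Babylonian (Newton) update
    goal_met=lambda x: abs(x * x - 2) < 1e-9,  # how we know we are done
    start=1.0,
)
print(root, cycles)
```

Take away `goal_met`, or make it something that can never be satisfied, and the loop just burns cycles until it gives up.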

Now, be honest, when have you ever been involved in a design process that started like that? But this is where good design starts: with a goal.

So, we have a goal – articulated in a testable way, importantly. What next?

Next, we imaginate (or is it visionize? I can never keep up with the management-speak) a feature – a proverbial button the user clicks – that solves their problem. What does it do?

Don’t think about how it works. Just focus on visualifying (I’m getting the hang of this now) what happens when the user clicks that magical button.

In our case, we imagine that when the user clicks the Big Magic Button of Destiny, they’re shown a list of takeaway restaurants with a decent vegan menu who can deliver to their address within a specified time (e.g., 45 minutes).

That’s our headline feature. A headline feature is the feature that solves the customer’s problem, and – therefore – is the reason for the system to exist. No, “Login” is never a headline feature. Nobody uses software because they want to log in.

Now we have a testable goal and a headline feature that solves the customer’s problem. It’s time to think about how that headline feature could work.

We would need a complete list of takeaway restaurants with decent vegan menus within any potential delivery address in our target area of Greater London.

We would need to know how long it might take to deliver from each restaurant to the customer’s address.

This would include knowing if the restaurant is still taking orders at that time.

Our headline feature will require other features to make it work. I call these supporting features. They exist only because of the headline feature – the one that solves the problem. The customer doesn’t want a database. They want vegan takeaway, damn it!

Our simple system will need a way to add restaurants to the list. It will need a way to estimate delivery times (including food preparation) between restaurant and customer addresses – and this may change (e.g., during busy times). It will need a way for restaurants to indicate if they’re accepting orders in real time.

At this point, you may be envisaging some fancypants Uber Eats style of solution with whizzy maps showing delivery drivers aimlessly circling your street for 10 minutes because nobody reads the damn instructions these days. Grrr.

But it ain’t necessarily so. This early on in the design process is no time for whizzy. Whizzy comes later. If ever. Remember, we’re setting out here to solve a problem, not build a whizzy solution.

I’ve seen some very high-profile applications go live with data entry interfaces knocked together in MS Access for that first simple release, for example. Remember, this isn’t a system for adding restaurant listings. This is a system for finding vegan takeaway. The headline feature’s always front-and-centre – our highest priority.

Also remember, we don’t know if this solution is actually going to solve the problem. The sooner we can test that, the sooner we can start iterating towards something better. And the simpler the solution, the sooner we can put it in the hands of end users. Let’s face it, there’s a bit of smoke and mirrors to even the most mature software solutions. We should know; we’ve looked behind the curtain and we know there’s no actual Wizard.

Once we’re talking about features like “Search for takeaway”, we should be in familiar territory. But even here, far too many teams don’t really grok how to get from a feature to working code.

But this thought process should be ingrained in every developer. Sing along if you know the words:

  • Who is the user and what do they want to do?
  • What jobs does the software need to do to give them that?
  • What data is required to do those jobs?
  • How can the work and the data be packaged together (e.g., in classes)?
  • How will those modules talk to each other to coordinate the work end-to-end?

This is the essence of high-level modular software design. The syntax may vary (classes, modules, components, services, microservices, lambdas), but the thinking is the same. The user has needs (find vegan takeaway nearby). The software does work to satisfy those needs (e.g., estimate travel time). That work involves data (e.g., the addresses of restaurant and customer). Work and data can be packaged into discrete modules (e.g., DeliveryTimeEstimator). Those modules will need to call other modules to do related work (e.g., address.asLatLong()), and will therefore need “line of sight” – otherwise known as a dependency – to send that message.
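As a rough illustration, here's how that packaging of work and data might look in Python. Only the names DeliveryTimeEstimator and asLatLong (spelled `as_lat_long` here, following Python convention) come from the text; the fields, the constructor parameters and the straight-line distance calculation are assumptions made for the sake of a short example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Data plus the work that belongs with it."""
    text: str
    latitude: float
    longitude: float

    def as_lat_long(self):
        # A real system would probably geocode the text; here the
        # coordinates are simply stored (an assumption for brevity).
        return self.latitude, self.longitude


class DeliveryTimeEstimator:
    """Estimates minutes from restaurant to customer, including preparation."""

    EARTH_RADIUS_KM = 6371.0

    def __init__(self, prep_minutes, minutes_per_km):
        self.prep_minutes = prep_minutes
        self.minutes_per_km = minutes_per_km

    def estimate(self, restaurant, customer):
        # The dependency: this module needs "line of sight" to Address.
        return self.prep_minutes + self.minutes_per_km * self._distance_km(
            restaurant.as_lat_long(), customer.as_lat_long())

    def _distance_km(self, origin, destination):
        # Great-circle (haversine) distance; ignores roads and traffic.
        lat1, lon1 = map(math.radians, origin)
        lat2, lon2 = map(math.radians, destination)
        a = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
        return 2 * self.EARTH_RADIUS_KM * math.asin(math.sqrt(a))


estimator = DeliveryTimeEstimator(prep_minutes=20, minutes_per_km=3)
restaurant = Address("a restaurant", 51.50, -0.12)
customer = Address("a customer", 51.52, -0.10)
print(round(estimator.estimate(restaurant, customer)))
```

The estimator knows nothing about how an address finds its coordinates. It just sends the message.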

You can capture this in a multitude of different ways – Class-Responsibility-Collaboration (CRC) cards, UML sequence diagrams… heck, embroider it on a tapestry for all I care. The thought process is the same.

This bird’s-eye view of the modules, their responsibilities and their dependencies needs to be translated into whichever technology you’ve selected to build this with. Maybe the modules are Java classes. Maybe they’re AWS lambdas. Maybe they’re COBOL programs.

Here we should be in writing code mode. I’ve found that if your on-paper (or on tapestry, if you chose that route) design thinking goes into detail, then it’s adding no value. Code is for details.

Start writing automated tests. Now that really should be familiar territory for every dev team.

/ sigh /
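For the headline feature, a first couple of tests might look something like this Python sketch. The `Restaurant` fields and the `search_takeaway` function are assumptions made up for illustration, with just enough implementation to make the tests pass. The test functions follow pytest naming, but can also be called directly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Restaurant:
    name: str
    delivery_minutes: int    # estimated time to this customer's address
    accepting_orders: bool


def search_takeaway(restaurants, within_minutes):
    """Vegan takeaways that can deliver to the customer in time."""
    return [r for r in restaurants
            if r.accepting_orders and r.delivery_minutes <= within_minutes]


NEAR = Restaurant("Near and open", 30, True)
FAR = Restaurant("Too far away", 70, True)
CLOSED = Restaurant("Near but closed", 25, False)


def test_lists_restaurants_that_can_deliver_in_time():
    assert search_takeaway([NEAR, FAR], within_minutes=45) == [NEAR]


def test_leaves_out_restaurants_not_taking_orders():
    assert search_takeaway([NEAR, CLOSED], within_minutes=45) == [NEAR]
```

Note that the tests talk about what the user gets back, not about databases or maps. Whizzy comes later.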

The design thinking never stops, though. For one, remember that everything so far is a theory. As we get our hands dirty in the details, our high-level design is likely to change. The best laid plans of mice and architects…

And, as the code emerges one test at a time, there’s more we need to think about. Our primary goal is to build something that solves the customer’s problem. But there are secondary goals – for example, how easy it will be to change this code when we inevitably learn that it didn’t solve the problem (or when the problem changes).

Most kitchen designs you can cater a dinner party in. But not every kitchen is easy to change.

It’s vital to remember that this is an iterative process. It only works if we can go around again. And again. And again. So organising our code in a way that makes it easy to change is super-important.

Enter stage left: refactoring.

Half the design decisions we make will be made after we’ve written the code that does the job. We may realise that a function or method is too big or too complicated and break it down. We may realise that names we’ve chosen make the code hard to understand, and rename. We may see duplication that could be generalised into a single, reusable abstraction.
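A small, hypothetical before-and-after in Python shows the kind of thing I mean. None of these functions come from a real codebase; they're invented to show duplication being generalised once it has actually appeared in the code.

```python
# Before: two functions repeat the same loop-and-collect logic.
def open_names_before(restaurants):
    names = []
    for restaurant in restaurants:
        if restaurant["accepting_orders"]:
            names.append(restaurant["name"])
    return names


def quick_names_before(restaurants, max_minutes):
    names = []
    for restaurant in restaurants:
        if restaurant["delivery_minutes"] <= max_minutes:
            names.append(restaurant["name"])
    return names


# After: the duplication is generalised into one reusable function.
def names_where(restaurants, condition):
    return [r["name"] for r in restaurants if condition(r)]


def open_names(restaurants):
    return names_where(restaurants, lambda r: r["accepting_orders"])


def quick_names(restaurants, max_minutes):
    return names_where(restaurants, lambda r: r["delivery_minutes"] <= max_minutes)
```

The behaviour is the same before and after, which is exactly what the tests are there to confirm.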

Rule of thumb: if your high-level design includes abstractions (e.g., interfaces, design patterns, etc), you’ve detailed too early.

Jason Gorman, probably on a Thursday

The need for abstractions emerges organically as the code grows, through the process of reviewing and refactoring that code. We don’t plan to use factories or the strategy pattern, or to have a Vendor interface, in our solution. We discover the need for them to solve problems of software maintainability.

By applying organising principles like Simple Design, D.R.Y., Tell, Don’t Ask, Single Responsibility and the rest to the code as it grows, good, maintainable modular designs will emerge – often in unexpected ways. Let go of your planned architecture, and let the code guide you. Face it, it was going to be wrong anyway. Trust me: I know.

Here’s another place that far too many teams go wrong. As your code grows and an architecture emerges, it’s very, very helpful to maintain a bird’s-eye view of what that emerging architecture is becoming. Ongoing visualisation of the software – its modules, patterns, dependencies and so on – is something surprisingly few teams do these days. Working on agile teams, I’ve invested some of my time in creating and maintaining these maps of the actual terrain and displaying them prominently in the team’s area – domain models, UX storyboards, key patterns we’ve applied (e.g., how have we done MVC?). You’d be amazed what gets missed when everyone’s buried in code, neck-deep in details, and nobody’s keeping an eye on the bigger picture. This, regrettably, is becoming a lost skill – the baby Agile threw out with the bathwater.

So we build our theoretical solution, and deliver it to end users to try. And this is where the design process really starts.

Until working code meets the real world, it’s all guesswork at best. We may learn that some of the restaurants are actually using dairy products in the preparation of their “vegan” dishes. Those naughty people! We may discover that different customers have very different ideas about what a “decent vegan menu” looks like. We may learn that our estimated delivery times are wildly inaccurate because restaurants tell fibs to get more orders. We may get hundreds of spoof orders from teenagers messing with the app from the other side of the world.

Here’s my point: once the system hits the real world, whatever we thought was going to happen almost certainly won’t. There are always lessons that can only be learned by trying it for real.

So we go again. And that is the true essence of software design.

When are we done? When we’ve solved the problem.

And then we move on to the next problem. (e.g., “Yeah, vegan food’s great, but what about vegan booze?”)

Will There Be A Post-Pandemic IT Boom?

For billions of people around the world, things are pretty uncertain now. Hundreds of millions have lost their jobs. Businesses of all sizes – but especially smaller and newer businesses, many start-ups – are in trouble. Many have already folded.

The experts predict a recession the likes of which we haven’t seen in anyone’s lifetime. But there may be one sector that – as the dust settles – might even grow faster as a result of the pandemic.

Information and communications technology has come to the fore as country after country locked down, ordering businesses to let their employees work from home wherever they could. This would not have been possible a generation ago for the vast majority. Most homes did not have computers, and almost no homes had Internet. Now, it’s the reverse.

While some household name brands have run into serious difficulties, new brands have become household names in the last 3 months – companies like Zoom, for example. “Zooming” is now as much a thing as “hoovering”.

Meanwhile, tens of thousands of established businesses have had their digital transformations stress-tested for the first time, and have found them wanting. From extreme cases like UK retailer Primark, who effectively had no online capability, to old hands in every sector who’ve invested billions in digital over the last 30 years, it seems most were not quite as “digital” as it turns out they needed to be.

From customer-facing transactions to internal business processes, the pandemic has revealed gaps that were being filled by people necessarily co-located in offices and shops and factories and so on. A client of mine, for example, still doesn’t have the ability to sign up new suppliers without a human being in the accounts department to access the mainframe via one of the dedicated terminals. They are rushing now to close that gap, but their mainframe skills base has dwindled to the point where nobody knows how. So they have to hire someone with COBOL skills to make the changes on that side, and C# skills to write the web front end for it. Good luck with that!

I’m noticing these digital gaps everywhere. Most organisations have them. They were missed because the processes still worked, thanks to the magic of People Going To Offices™. But now those gaps have been laid bare for everyone to see (and for customers and suppliers to experience).

Here’s the thing: things aren’t going back to normal. The virus is going to be with us for some time, and even after we’ve tamed it with a vaccine or new treatments, everyone will be thinking about the next new virus. Just as COVID-19 leaves its mark on people it infects, the pandemic will leave its mark on our civilisation. We will adapt to a new normal. And a big component of that new normal will be digital technology. As wars accelerate science and technology, so too will COVID-19 accelerate digital innovation.

And it’s a match made in heaven, because this innovation can largely be done from our homes, thanks to… digital technology! It’s a self-accelerating evolution.

So, I have an inkling we’re going to be very busy in the near future.

10 Things Every *Good* Software Development Method Does

I’ve been a programmer for the best part of four decades – three of them professionally – and, for the last 25 years, a keen student of this thing we call “software development”.

I’ve studied and applied a range of software development methods, principles, and techniques over those years. While, on the surface, Fusion may look different to the Unified Process, which may look different to Extreme Programming, which may look different to DSDM, which may look different to Cleanroom Software Engineering, when you look under the hood of these approaches, they actually have some fundamental things in common.

Here are the 10 things every software developer should know:

  1. Design starts with end users and their goals – be it with use cases, or with user stories, or with the “features” of Feature-Driven Development, the best development approaches drive their solution designs by first asking: Who will be using this software, and what will they be using it to do?
  2. Designs grow one usage scenario at a time – scenarios or examples drive the best solution designs, and those designs are fleshed out one scenario at a time to satisfy the user’s goal in “happy paths” (or to recover gracefully from not satisfying the user’s goal, which we call “edge cases”). Developers who try to consider multiple scenarios simultaneously tend to bite off more than they can chew.
  3. Solutions are delivered one scenario at a time – teams who deliver working software in end-to-end slices of functionality (e.g., the UI, business logic and database required to do a thing the user requires) tend to fare better than teams who deliver horizontal slices across their architecture (the UI components for all scenarios, and then the business logic, and then the database code). This is for two key reasons. Firstly, they can get user feedback from working features sooner, which speeds up the learning process. Secondly, if they only manage to deliver 75% of the software before a release date, they will have delivered 75% of end-to-end working features, instead of 75% of the layers of all features. We call this incremental delivery.
  4. Solutions evolve based on user feedback from increments – the other key ingredient in the way we deliver working software is how we learn from the feedback we get from end users in each increment of the software. With the finest requirements and design processes – and the best will in the world – we can’t expect to get it right first time. Maybe our solution doesn’t give them what they wanted. Maybe what they wanted turns out to be not what they really needed. The only way to find out for sure is to deliver what they asked for and let them take it for a spin. And then the feedback starts flooding in. The best approaches accept that feedback is not just unavoidable, it’s very desirable, and teams seek it out as often as possible.
  5. Plans change – if we can’t know whether we’re delivering the right software for sure until we’ve delivered it, then our approach to planning must be highly adaptable. Although the wasteland of real-world software development is littered with the bleached bones of “waterfall” projects that attempted to get it right first time (and inevitably failed), the idealised world of software development methods rejected that idea many decades ago. All serious methods are iterative, and all serious methods tell us that the plan will necessarily change. It’s management who resist change, not methods.
  6. Code changes – if plans change based on what we learn from end users, then it stands to reason that our code must also change to accommodate their feedback. This is the sticking point on many “agile” development teams. Their management processes may allow for the plan to change, but their technical practices (or the lack of them) may mean that changing the code is difficult, expensive and risky. There are a range of factors in the cost of changing software, but in the wider perspective, it essentially boils down to “How long will it take to deliver the next working iteration to end users?” If the answer is “months”, then change is going to be slow and the users’ feedback will be backed up like the LA freeway on a Monday morning. If it’s “minutes” then you can iterate very rapidly and learn your way to getting it right much faster. Delivery cycles are fundamental. They’re the metabolism of software development.
  7. Testing is fast and continuous – if the delivery cycle of the team is its metabolism, then testing is its thyroid. How long it takes to establish if our software’s broken will determine how fast our delivery cycle can be (if the goal is to avoid delivering broken software, of course). If you aspire to a delivery cycle of minutes, then that leaves minutes to re-test your software. If all your testing’s done manually, then a modestly complex system will likely take weeks to re-test. And it’s a double whammy. Studies show that the longer a bug goes undetected, the more it costs to fix, and that the cost grows exponentially. If I break some code now and find out a minute from now, it’s a trifle to fix it. If I find out 6 weeks from now, it’s a whole other ball game. Teams who leave testing late typically end up spending most of their time fixing bugs instead of delivering valuable features and changes. All of this can profoundly impact delivery cycles and the cost of adapting to user feedback. Testing early and often is a feature of all serious methods. Automating our tests so they run fast is a feature of all the best methods.
  8. All work is undo-able – If we accept that it’s completely unrealistic to expect to get things right first time, then we must also accept that all the work we do is essentially an experiment from which we must learn. Sometimes, what we’ll learn is that what we’ve done is simply no good, and we need to do over. Software Configuration Management (of which version control is the central pillar) is a key component of all serious software development methods. A practice like Continuous Integration, done right, can bring us high levels of undo-ability, which massively reduces risk in what is a pretty risky endeavour. To use an analogy, think of software development as a multi-level computer game. Experienced gamers know to back up their place in the game frequently, so they don’t have to replay huge parts of it after a boo-boo. Same thing with version control and SCM. We don’t want our versions to be too far apart, or we’ll end up in a situation where we have to redo weeks or months of work because we took a wrong turn in the maze.
  9. Architecture is a process (not a person or a thing) – The best development methods treat software architecture and design as an ongoing activity that involves all stakeholders and is never finished. Good architectures are driven directly from user goals, ensuring that those goals are satisfied by the design above all else (e.g., use case realisations in the Unified Process), and applying organising principles – Simple Design, “Tell, Don’t Ask”, SOLID etc – to the internals of the solution design to ensure the code will be malleable enough to change to meet future needs. As an activity, architecture encompasses everything from the goals and tasks of end users, to the modular structure of the solution, to the everyday refactorings that are performed against code that falls short, the test suites that guard against regressions, the documentation that ships with the end product, and everything else which is informed by the design process. Since architecture is all-encompassing, all serious development methods mandate that it be a shared responsibility. The best methods strongly encourage a high level of architectural awareness within the team through continuous visualisation and review of the design. To some extent, everyone involved is defining the architecture. It is ever-changing and everyone’s responsibility.
  10. “Done” means we achieved the customer’s end goal – All of our work is for nothing if we don’t solve the problem we set out to solve. Too many teams are short-sighted when it comes to evaluating their success, considering only that a list of requested features was delivered, or that a product vision was realised. But all that tells us is that we administered the medicine. It doesn’t tell us if the medicine worked. If iterative development is a search algorithm, then it’s a goal-seeking search algorithm. One generation of working software at a time, we ask our end users to test the solution as a fit to their problem, learn what worked and what didn’t, and then go around again with an improved solution. We’re not “done” until the problem’s been solved. While many teams pay lip service to business goals or a business context, it’s often more as an exercise in arse-covering – “We need a business case to justify this £10,000,000 CRM system we’ve decided to build anyway!” – than the ultimate driver of the whole development process. Any approach that makes defining the end goal a part of the development process has put the cart before the horse. If we don’t have an end goal – a problem to be solved – then development shouldn’t begin. But all iterative development methods – and they’re all iterative to some degree – can be augmented with an outer feedback loop that considers business goals and tests working software in business situations, driving everything from there.
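The arithmetic behind point 3 is easy to check for yourself. Here's a rough sketch in Python, assuming a hypothetical product with four features (the feature names are made up) built across the three layers mentioned above, where the team runs out of time after 75% of the work items.

```python
LAYERS = ["ui", "business_logic", "database"]    # the layers named in point 3
FEATURES = ["search", "order", "pay", "track"]   # hypothetical features

WorkItem = tuple[str, str]  # (feature, layer)


def working_features(done: set[WorkItem]) -> list[str]:
    """A feature only works end to end when every one of its layers is done."""
    return [f for f in FEATURES if all((f, layer) in done for layer in LAYERS)]


def delivered(plan: list[WorkItem], fraction: float) -> set[WorkItem]:
    """The work items finished if the team stops after `fraction` of the plan."""
    return set(plan[: int(len(plan) * fraction)])


# Vertical slices: finish every layer of one feature before starting the next.
vertical = [(f, layer) for f in FEATURES for layer in LAYERS]
# Horizontal slices: finish one layer for every feature before the next layer.
horizontal = [(f, layer) for layer in LAYERS for f in FEATURES]

print(working_features(delivered(vertical, 0.75)))    # ['search', 'order', 'pay']
print(working_features(delivered(horizontal, 0.75)))  # ['search']
```

Same effort, same 9 of 12 work items completed. One team ships three features users can try. The other ships one, plus a lot of UI and business logic that can't be used for anything yet.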

As a methodologist, I could spin you up an infinite number of software development methods with names like Goal-Oriented Object Delivery, or Customer Requirement Architectural Process. And, on the surface, I could make them all look quite different. But scratch the surface, and they’d all be fundamentally the same, in much the same way that programming languages – when you look past their syntax – tend to embrace the same underlying computing concepts.

Save yourself some time. Embrace the concepts.

Automate, Automate, Autonomy!

Thanks to pandemic-induced economic chaos, you’ve been forced to take a job on the quality assurance line at a factory that produces things.

The machine creates all kinds of random things, but your employer only sells a very specific subset of those things. All the things that don’t fit the profile have to be rejected, and melted down and fed back into the machine to make more things.

On your first day, you get training. (Oh, would that it were true in software development!)

They stand you at the quality gate and start up the machine. All kinds of things come down the line at you. Your line manager tells you “Only let the green things through”. You grab all the things that aren’t green and throw them into the recycle bin. So far, so good.

“Only let the green round things through!” shouts your line manager. Okay, you think. Bit harder now. All non-green, non-round things go in the bin.

“Only let the green round small things through!” Now you’re really having to concentrate. A few green round small things end up in the bin, and a few non-green, non-round, non-small things get through.

“Only let the green round small things with Japanese writing on them through!” That’s a lot to process at the same time. Now your brain is struggling to cope. A bunch of blue and red things with Japanese writing on them get through. A bunch of square things get through. Your score has gone from 100% accurate to just 90%. Either someone will have to go through the boxes that have been packed and pick out all the rejects, or they’ll have to deal with 10% customer returns after they’ve been shipped.

“Only let the green round small things with Japanese writing on them that have beveled edges and a USB charging port on the opposite side to the writing and a power button in the middle of the writing and a picture of a horse  – not a donkey, mind, reject those ones! – and that glow in the dark through!”

Now it’s chaos. Almost every box shipped contains things that should have been thrown in the recycle bin. Almost every order gets returned. That’s just too much to process. Too many criteria.

We have several choices here:

  1. Slow down the line so we can methodically examine every thing against our checklist, one criterion at a time.
  2. Hire a whole bunch of people and give them one check each to do.
  3. Reset customer expectations about the quality of the things they’re buying.
  4. Automate the checking using cameras and robots and lasers and super-advanced A.I. so all those checks can be made at production speed to a high enough accuracy.

Number 4 is the option that potentially gives us the win-win of customer satisfaction and high productivity without the bigger payroll. It’s been the driving force behind the manufacturing revolutions in East Asia for the last 70 years: automate, automate, automate.
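In software terms, option 4 looks something like the sketch below: every criterion becomes a small automated check, and the gate runs all of them on every thing at machine speed. This is a toy illustration in Python, not a real inspection system. The Thing fields and the check names are assumptions taken from the line manager's first four instructions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Thing:
    colour: str
    shape: str
    size: str
    writing: str


Check = Callable[[Thing], bool]

# One small, named check per criterion. Adding a criterion means adding a line,
# not asking a human to hold one more rule in their head.
CHECKS: dict[str, Check] = {
    "green": lambda t: t.colour == "green",
    "round": lambda t: t.shape == "round",
    "small": lambda t: t.size == "small",
    "japanese writing": lambda t: t.writing == "japanese",
}


def failed_checks(thing: Thing) -> list[str]:
    """Names of every criterion this thing fails (empty means it can ship)."""
    return [name for name, check in CHECKS.items() if not check(thing)]


def quality_gate(things: list[Thing]) -> tuple[list[Thing], list[Thing]]:
    """Split the line's output into things to ship and things to recycle."""
    shipped: list[Thing] = []
    recycled: list[Thing] = []
    for thing in things:
        (recycled if failed_checks(thing) else shipped).append(thing)
    return shipped, recycled


good = Thing("green", "round", "small", "japanese")
bad = Thing("blue", "square", "small", "japanese")
shipped, recycled = quality_gate([good, bad])
print(len(shipped), len(recycled))  # 1 1
print(failed_checks(bad))           # ['green', 'round']
```

The tenth criterion costs the gate no more concentration than the first, which is exactly what can't be said of the person standing at the line.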

But it doesn’t come for free. High levels of automation require considerable ongoing investment in time, technology and training. In the UK, we’ve under-invested, becoming more and more inefficient and expensive while the quality of our output has declined. Shareholders want their return now. There’s no appetite for making improvements for the future.

There are obvious parallels in software development. Businesses want their software now. Most software organisations have little inclination to invest the time, technology and training required to reach the high levels of automation needed to achieve the coveted continuous delivery that would allow them to satisfy customer needs sooner, cheaper, and for longer.

The inescapable reality is that frictionless delivery demands an investment of 20-25% of your total software development budget. To put it more bluntly, everyone should be spending 1 day a week not on immediate customer requirements, but on making improvements in the delivery process that would mean meeting future customer requirements will be easier.

And so, for most teams, it never gets easier. The software just gets buggier, later and more expensive year after year.

What distinguishes those software teams who are getting it right from the rest? From observation, I’ve seen the same factor every time: autonomy. Teams will invest that 20-25% when it’s their choice. They’re tasked with delivering value, and allowed to figure out how best to do that. Nobody’s telling them how to do their jobs.

How did this blissful state come about? Again, from observation, those teams have autonomy because they took it. Freedom is rarely willingly given.

Now, I appreciate this is a whole can of worms. To take their autonomy, teams need to earn trust. The more trust a team has earned, the more likely they’ll be left alone. And this can be a chicken and egg kind of situation. To earn trust, the team has to reliably deliver. To reliably deliver, the team needs autonomy. This whole process must begin with a leap of faith on the business’s part. In other words, they have to give teams the benefit of the doubt long enough to see the results.

And here come the worms… Teams have to win over their customer from the start, before anything’s been delivered – before the customer’s had a chance to taste our pudding. This means that developers need to inspire enough confidence with their non-technical stakeholders – remember, this is a big money game – to reassure everyone that they’re in good hands. And we’re really, really bad at this.

The temptation is to over-promise, and set unrealistic expectations. This pretty much guarantees disappointment. The best way to inspire confidence is to have a good track record. No lawyer can guarantee to win your case. But a lawyer who won 9 of their last 10 cases is going to inspire more confidence than a lawyer who’s taking this as their first case promising you a win.

And we’re really, really bad at this, too – chiefly because software development teams are newly formed for that specific piece of work and don’t have a track record to speak of. Sure, individual developers may be known quantities, but in software, the unit of delivery is the team. I’ve watched teams of individually very strong developers fall flat on their collective arse.

And this is why I believe that this picture won’t change until organisations start to view teams as assets, and invest in them for a long-term pay-off as well as short-term delivery, 20/80. And, again, I don’t think this will be willingly given. So maybe we – as a profession – need to take the decision out of their hands.

It could all start with one big act of collective autonomy.

 

 

Why COBOL May Be The Language In Your Future

Yes, I know. Preposterous! COBOL’s 61 years old, and when was the last time you bumped into a COBOL programmer still working? Surely, Java is the new COBOL, right?

Think again. COBOL is alive and well. Some 220 billion lines of it power 71% of Fortune 500 companies. If a business is big enough and has been around long enough, there’s a good chance the lion’s share of the transactions you do with that business involve some COBOL.

Fact is, they’re kind of stuck with it. Mainframe systems represent a multi-trillion dollar investment going back many decades. COBOL ain’t going nowhere for the foreseeable future.

What’s going is not the language but the programmers who know it and who know those critical business systems. The average age of a COBOL programmer in 2014 was 55. No doubt in 2020 it’s older than that, as young people entering IT aren’t exactly lining up to learn COBOL. Colleges don’t teach it, and you rarely hear it mentioned within the software development community. COBOL just isn’t sexy in the way Go or Python are.

As the COBOL programmer community edges towards statistical retirement – with the majority already retired (and frankly, dead) – the question looms: who is going to maintain these systems in 10 or 20 years’ time?

One thing we know for sure: businesses have two choices – they can either replace the programmers, or replace the programs. Replacing legacy COBOL systems has proven to be very time-consuming and expensive for some banks. Commonwealth Bank of Australia took 5 years and $750 million to replace its core COBOL platform in 2012, for example.

And to replace a COBOL program, developers writing the new code at least need to be able to read the old code, which will require a good understanding of COBOL. There’s no getting around it: a bunch of us are going to have to learn COBOL one way or another.

I did a few months of COBOL programming in the mid-1990s, and I’d be lying if I said I enjoyed it. Compared to modern languages like Ruby and C#, COBOL is clunky and hard work.

But I’d also be lying if I said that COBOL can’t be made to work in the context of modern software development. In 1995, we “version controlled” our source files by replacing listings in cupboards. We tested our programs manually (if we tested them at all before going live). Our release processes were effectively the same as editing source files on the live server (on the mainframe, in this case).

But it didn’t need to be like that. You can manage versions of your COBOL source files in a VCS like Git. You can write unit tests for COBOL programs. You can do TDD in COBOL (see Exhibit A below).

IDENTIFICATION DIVISION.
PROGRAM-ID. BASKET-TEST.
DATA DIVISION.
FILE SECTION.
WORKING-STORAGE SECTION.
*> The copybooks presumably declare basket, total and the test counters.
    COPY 'total_params.cpy'.
    COPY 'test_context.cpy'.
    01 expected PIC 9(04)V9(2).
PROCEDURE DIVISION.
MAIN-PROCEDURE.
*> Run each test paragraph in turn, then report the results.
    PERFORM EMPTY-BASKET.
    PERFORM SINGLE_ITEM.
    PERFORM TWO_ITEMS.
    PERFORM QUANTITY_TWO.
    DISPLAY 'Tests passed: ' passes.
    DISPLAY 'Tests failed: ' fails.
    DISPLAY 'Tests run: ' totalRun.
    STOP RUN.
EMPTY-BASKET.
*> An empty basket should total zero.
    INITIALIZE basket REPLACING NUMERIC DATA BY ZEROES.
    CALL 'TOTAL' USING basket, total.
    MOVE 0 TO expected.
    CALL 'ASSERT_EQUAL' USING 'EMPTY BASKET',
        expected, total, test-context.
SINGLE_ITEM.
*> One item, quantity one: the total is its unit price.
    INITIALIZE basket REPLACING NUMERIC DATA BY ZEROES.
    MOVE 100 TO unitprice(1).
    MOVE 1 TO quantity(1).
    CALL 'TOTAL' USING basket, total.
    MOVE 100 TO expected.
    CALL 'ASSERT_EQUAL' USING 'SINGLE_ITEM',
        expected, total, test-context.
TWO_ITEMS.
*> Two different items: the total is the sum of both prices.
    INITIALIZE basket REPLACING NUMERIC DATA BY ZEROES.
    MOVE 100 TO unitprice(1).
    MOVE 1 TO quantity(1).
    MOVE 200 TO unitprice(2).
    MOVE 1 TO quantity(2).
    CALL 'TOTAL' USING basket, total.
    MOVE 300 TO expected.
    CALL 'ASSERT_EQUAL' USING 'TWO_ITEMS',
        expected, total, test-context.
QUANTITY_TWO.
*> One item, quantity two: the total is price times quantity.
    INITIALIZE basket REPLACING NUMERIC DATA BY ZEROES.
    MOVE 100 TO unitprice(1).
    MOVE 2 TO quantity(1).
    CALL 'TOTAL' USING basket, total.
    MOVE 200 TO expected.
    CALL 'ASSERT_EQUAL' USING 'QUANTITY_TWO',
        expected, total, test-context.
END PROGRAM BASKET-TEST.

You can refactor COBOL code (“Extract Paragraph”, “Extract Program”, “Move Field” etc), and you can automate a proper build and release process to deploy changed code safely to a mainframe (and roll it back if there’s a problem).

It’s possible to be agile in COBOL. The reason why so much COBOL legacy code fails in that respect has much more to do with decades of poor programming practices and very little to do with the language or the associated tools themselves.

I predict that, as more legacy COBOL programmers retire, the demand – and the pay – for COBOL programmers will rise to a point where some of you out there will find it irresistible. And the impact on society if they can’t be found will be severe.

The next generation of COBOL programmers may well be us.