The Software Design Process

One thing that sadly rarely gets discussed these days is how we design software. That is, how we get from a concept to working code.

As a student (and teacher) of software design and architecture of many years, experiencing first-hand many different methodologies from rigorous to ad hoc, heavyweight to agile, I can see similarities between all effective approaches.

Whether you’re UML-ing or BDD-ing or Event Storming-ing your designs, when it works, the thought process is the same.

It starts with a goal.

This – more often than not – is a problem that our customer needs solving.

This, of course, is where most teams get the design thinking wrong. They don’t start with a goal – or if they do, most of the team aren’t involved at that point, and subsequently are not made aware of what the original goal or problem was. They’re just handed a list of features and told “build that”, with no real idea what it’s for.

But they should start with a goal.

In design workshops, I encourage teams to articulate the goal as a single, simple problem statement, e.g.:

It’s really hard to find good vegan takeaway in my area.

Jason Gorman, just now

Our goal is to make it easier to order vegan takeaway food. This, naturally, raises the question: how hard is it to order vegan takeaway today?

If our target customer area is Greater London, then at this point we need to hit the proverbial streets and collect data to help us answer that question. Perhaps we could pick some random locations – N, E, S and W London – and try to order vegan takeaway using existing solutions, like Google Maps, Deliveroo and even the Yellow Pages.

Our data set gives us some numbers. On average, it took 47 minutes to find a takeaway restaurant with decent vegan options. They were, on average, 5.2 miles from the random delivery address. The orders took a further 52 minutes to be delivered. In 19% of selected delivery addresses, we were unable to order vegan takeaway at all.

What I’ve just done there is apply a simple thought process known as Goal-Question-Metric.
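As a rough sketch of how numbers like those might be derived from field notes, the Python below averages a handful of observations. The `Attempt` record, its field names and the sample values are all made up for illustration; they are not the data set described above.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Attempt:
    """One attempt to order vegan takeaway from a random address (hypothetical record)."""
    search_minutes: float               # time spent looking for a restaurant
    distance_miles: Optional[float]     # None if nothing suitable was found
    delivery_minutes: Optional[float]   # None if no order could be placed


def baseline_metrics(attempts):
    """Answer the question 'how hard is it today?' with numbers."""
    successes = [a for a in attempts if a.delivery_minutes is not None]
    return {
        "avg_search_minutes": mean(a.search_minutes for a in attempts),
        "avg_distance_miles": mean(a.distance_miles for a in successes),
        "avg_delivery_minutes": mean(a.delivery_minutes for a in successes),
        "failure_rate": 1 - len(successes) / len(attempts),
    }


# Made-up sample values, for illustration only.
attempts = [
    Attempt(40, 4.0, 50),
    Attempt(50, 6.0, 60),
    Attempt(30, 5.0, 40),
    Attempt(60, None, None),
]
metrics = baseline_metrics(attempts)
```

The goal prompts the question, the question tells us what to measure, and the metrics become the baseline any solution has to beat.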

We ask ourselves, which of these do we think we could improve on with a software solution? I’m not at all convinced software would make the restaurants cook the food faster. Nor will it make the traffic in London less of an obstacle, so delivery times are unlikely to speed up much.

But if our data suggested that to find a vegan menu from a restaurant that will deliver to our address we had to search a bunch of different sources – including telephone directories – then I think that’s something we could improve on. It hints strongly that lack of vegan options isn’t the problem, just the ease of finding them.

A single searchable list of all takeaway restaurants with decent vegan options in Greater London might speed up our search. Note that word: MIGHT.

I’ve long advocated that software specifications be called “theories”, not “solutions”. We believe that if we had a searchable list of all those restaurants we had to look in multiple directories for, that would make the search much quicker, and potentially reduce the incidences when no option was found.

Importantly, we can compare the before and the after – using the examples we pulled from the real world – to see if our solution actually does improve search times and hit rates.

Yes. Tests. We like tests.
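A before-and-after check of that kind might look something like the sketch below. The `goal_achieved` helper and the "after" figures are hypothetical; the "before" figures reuse the 47 minutes and 19% from the street survey. The point is only that the goal is expressed as a comparison that can fail.

```python
def goal_achieved(before, after):
    """True if search got quicker AND fewer searches came up empty."""
    return (
        after["avg_search_minutes"] < before["avg_search_minutes"]
        and after["failure_rate"] < before["failure_rate"]
    )


# 'before' comes from the street survey; the 'after' figures are invented
# and would really come from repeating the same searches with the new solution.
before = {"avg_search_minutes": 47, "failure_rate": 0.19}
after_good = {"avg_search_minutes": 5, "failure_rate": 0.10}
after_bad = {"avg_search_minutes": 5, "failure_rate": 0.25}

assert goal_achieved(before, after_good)
assert not goal_achieved(before, after_bad)
```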

Think about it; we describe our modern development processes as iterative. But what does that really mean? To me – a physics graduate – it implies a goal-seeking process that applies the same operation over and over to an input, with the output of each cycle fed into the next, until it converges on a stable working solution.

Importantly, if there’s no goal, and/or no way of knowing if the goal’s been achieved, then the process doesn’t work. The wheels are turning, the engine’s revving, but we ain’t going anywhere in particular.
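A classic example of that kind of iteration, borrowed from numerical methods rather than from software projects, is Newton’s method for square roots. Each guess is fed back in to produce the next one, and the loop stops when the goal (a guess whose square is close enough to the target) has been met. The function name, tolerance and cycle limit are my own choices for this sketch.

```python
def iterate_to_sqrt(target, guess=1.0, tolerance=1e-9, max_cycles=100):
    """Repeatedly improve a guess until guess*guess is close enough to target."""
    for cycle in range(max_cycles):
        # The test: have we achieved the goal yet?
        if abs(guess * guess - target) <= tolerance:
            return guess, cycle
        # The output of this cycle becomes the input to the next one.
        guess = (guess + target / guess) / 2
    raise RuntimeError("did not converge")


root, cycles = iterate_to_sqrt(2.0)
```

Take away the goal check and the loop still turns; it just has no way of knowing whether it has arrived.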

Now, be honest, when have you ever been involved in a design process that started like that? But this is where good design starts: with a goal.

So, we have a goal – articulated in a testable way, importantly. What next?

Next, we imaginate (or is it visionize? I can never keep up with the management-speak) a feature – a proverbial button the user clicks – that solves their problem. What does it do?

Don’t think about how it works. Just focus on visualifying (I’m getting the hang of this now) what happens when the user clicks that magical button.

In our case, we imagine that when the user clicks the Big Magic Button of Destiny, they’re shown a list of takeaway restaurants with a decent vegan menu who can deliver to their address within a specified time (e.g., 45 minutes).

That’s our headline feature. A headline feature is the feature that solves the customer’s problem, and – therefore – is the reason for the system to exist. No, “Login” is never a headline feature. Nobody uses software because they want to log in.

Now we have a testable goal and a headline feature that solves the customer’s problem. It’s time to think about how that headline feature could work.

We would need a complete list of takeaway restaurants with decent vegan menus within any potential delivery address in our target area of Greater London.

We would need to know how long it might take to deliver from each restaurant to the customer’s address.

This would include knowing if the restaurant is still taking orders at that time.

Our headline feature will require other features to make it work. I call these supporting features. They exist only because of the headline feature – the one that solves the problem. The customer doesn’t want a database. They want vegan takeaway, damn it!

Our simple system will need a way to add restaurants to the list. It will need a way to estimate delivery times (including food preparation) between restaurant and customer addresses – and this may change (e.g., during busy times). It will need a way for restaurants to indicate if they’re accepting orders in real time.
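To make the relationship between the headline feature and its supporting features concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `Restaurant` record, the `find_vegan_takeaway` function, the restaurant names and the stubbed delivery estimates. Note how little is needed before the headline feature can be tried out.

```python
from dataclasses import dataclass


@dataclass
class Restaurant:
    name: str
    address: str
    accepting_orders: bool   # supporting feature: updated by the restaurant in real time


def find_vegan_takeaway(restaurants, customer_address, estimate_minutes, max_minutes=45):
    """Headline feature: who can deliver to this address within max_minutes?"""
    return [
        r for r in restaurants
        if r.accepting_orders
        and estimate_minutes(r.address, customer_address) <= max_minutes
    ]


# Supporting feature: some way to add restaurants to the list.
listings = [
    Restaurant("Hypothetical Falafel", "N1", accepting_orders=True),
    Restaurant("Imaginary Curry House", "E2", accepting_orders=False),
    Restaurant("Made-Up Pizza", "SW9", accepting_orders=True),
]

# Supporting feature: delivery time estimates, stubbed with fixed numbers here.
fake_estimates = {"N1": 30, "E2": 25, "SW9": 55}


def stub_estimate(restaurant_address, customer_address):
    return fake_estimates[restaurant_address]


results = find_vegan_takeaway(listings, "N7", stub_estimate)
```

Only the first restaurant comes back: the second isn’t taking orders and the third is too far away.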

At this point, you may be envisaging some fancypants Uber Eats style of solution with whizzy maps showing delivery drivers aimlessly circling your street for 10 minutes because nobody reads the damn instructions these days. Grrr.

But it ain’t necessarily so. This early on in the design process is no time for whizzy. Whizzy comes later. If ever. Remember, we’re setting out here to solve a problem, not build a whizzy solution.

I’ve seen some very high-profile applications go live with data entry interfaces knocked together in MS Access for that first simple release, for example. Remember, this isn’t a system for adding restaurant listings. This is a system for finding vegan takeaway. The headline feature’s always front-and-centre – our highest priority.

Also remember, we don’t know if this solution is actually going to solve the problem. The sooner we can test that, the sooner we can start iterating towards something better. And the simpler the solution, the sooner we can put it in the hands of end users. Let’s face it, there’s a bit of smoke and mirrors to even the most mature software solutions. We should know; we’ve looked behind the curtain and we know there’s no actual Wizard.

Once we’re talking about features like “Search for takeaway”, we should be in familiar territory. But even here, far too many teams don’t really grok how to get from a feature to working code.

But this thought process should be ingrained in every developer. Sing along if you know the words:

  • Who is the user and what do they want to do?
  • What jobs does the software need to do to give them that?
  • What data is required to do those jobs?
  • How can the work and the data be packaged together (e.g., in classes)?
  • How will those modules talk to each other to coordinate the work end-to-end?

This is the essence of high-level modular software design. The syntax may vary (classes, modules, components, services, microservices, lambdas), but the thinking is the same. The user has needs (find vegan takeaway nearby). The software does work to satisfy those needs (e.g., estimate travel time). That work involves data (e.g., the addresses of restaurant and customer). Work and data can be packaged into discrete modules (e.g., DeliveryTimeEstimator). Those modules will need to call other modules to do related work (e.g., address.asLatLong()), and will therefore need “line of sight” – otherwise known as a dependency – to send that message.
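Here is one way that packaging might look in Python. `DeliveryTimeEstimator` and the `as_lat_long()` method (a Python-style spelling of `asLatLong()`) come from the examples above; the `Address` fields, the preparation time, the travel speed and the straight-line distance calculation are placeholders I’ve assumed purely for the sketch.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Packages address data with the work that needs it."""
    latitude: float
    longitude: float

    def as_lat_long(self):
        return (self.latitude, self.longitude)


class DeliveryTimeEstimator:
    """Estimates delivery time; depends on Address to get coordinates."""

    def __init__(self, prep_minutes=20, miles_per_hour=15):
        # Both defaults are invented placeholders, not real-world figures.
        self.prep_minutes = prep_minutes
        self.miles_per_hour = miles_per_hour

    def estimate_minutes(self, restaurant, customer):
        miles = self._distance_miles(restaurant.as_lat_long(), customer.as_lat_long())
        return self.prep_minutes + miles / self.miles_per_hour * 60

    @staticmethod
    def _distance_miles(a, b):
        # Great-circle (haversine) distance, a crude stand-in for real routing.
        lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
        h = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
        return 2 * 3958.8 * math.asin(math.sqrt(h))


estimator = DeliveryTimeEstimator()
restaurant = Address(51.5074, -0.1278)
customer = Address(51.5474, -0.1278)
minutes = estimator.estimate_minutes(restaurant, customer)
```

The estimator needs “line of sight” to `Address` because it sends it the `as_lat_long()` message; that dependency is the one design fact worth showing at the bird’s-eye level.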

You can capture this in a multitude of different ways – Class-Responsibility-Collaboration (CRC) cards, UML sequence diagrams… heck, embroider it on a tapestry for all I care. The thought process is the same.

This bird’s-eye view of the modules, their responsibilities and their dependencies needs to be translated into whichever technology you’ve selected to build this with. Maybe the modules are Java classes. Maybe they’re AWS lambdas. Maybe they’re COBOL programs.

Here we should be in writing code mode. I’ve found that if your on-paper (or on tapestry, if you chose that route) design thinking goes into detail, then it’s adding no value. Code is for details.

Start writing automated tests. Now that really should be familiar territory for every dev team.

/ sigh /

The design thinking never stops, though. For one, remember that everything so far is a theory. As we get our hands dirty in the details, our high-level design is likely to change. The best laid plans of mice and architects…

And, as the code emerges one test at a time, there’s more we need to think about. Our primary goal is to build something that solves the customer’s problem. But there are secondary goals – for example, how easy it will be to change this code when we inevitably learn that it didn’t solve the problem (or when the problem changes).

Most kitchen designs you can cater a dinner party in. But not every kitchen is easy to change.

It’s vital to remember that this is an iterative process. It only works if we can go around again. And again. And again. So organising our code in a way that makes it easy to change is super-important.

Enter stage left: refactoring.

Half the design decisions we make will be made after we’ve written the code that does the job. We may realise that a function or method is too big or too complicated and break it down. We may realise that names we’ve chosen make the code hard to understand, and rename. We may see duplication that could be generalised into a single, reusable abstraction.
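A small before-and-after gives the flavour. The order-pricing example, the names and the delivery charge are all invented for this sketch; what matters is that the behaviour stays the same while the function is broken down, renamed and de-duplicated.

```python
# Before: one function doing several jobs, with cryptic names and duplication.
def calc(o):
    t = 0
    for i in o["food"]:
        t += i["p"] * i["q"]
    for i in o["drinks"]:
        t += i["p"] * i["q"]
    if t < 15:
        t += 2.5
    return t


# After: same behaviour, smaller named pieces, duplication folded into one place.
FREE_DELIVERY_THRESHOLD = 15
DELIVERY_CHARGE = 2.5


def line_total(item):
    return item["p"] * item["q"]


def subtotal(items):
    return sum(line_total(item) for item in items)


def order_total(order):
    total = subtotal(order["food"]) + subtotal(order["drinks"])
    if total < FREE_DELIVERY_THRESHOLD:
        total += DELIVERY_CHARGE
    return total


order = {
    "food": [{"p": 6.0, "q": 2}],
    "drinks": [{"p": 1.5, "q": 2}],
}
# The safety net: the refactored code must give the same answer as the original.
assert calc(order) == order_total(order)
```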

Rule of thumb: if your high-level design includes abstractions (e.g., interfaces, design patterns, etc), you’ve detailed too early.

Jason Gorman, probably on a Thursday

The need for abstractions emerges organically as the code grows, through the process of reviewing and refactoring that code. We don’t plan to use factories or the strategy pattern, or to have a Vendor interface, in our solution. We discover the need for them to solve problems of software maintainability.

By applying organising principles like Simple Design, D.R.Y., Tell, Don’t Ask, Single Responsibility and the rest to the code as it grows, good, maintainable modular designs will emerge – often in unexpected ways. Let go of your planned architecture, and let the code guide you. Face it, it was going to be wrong anyway. Trust me: I know.

Here’s another place that far too many teams go wrong. As your code grows and an architecture emerges, it’s very, very helpful to maintain a bird’s-eye view of what that emerging architecture is becoming. Ongoing visualisation of the software – its modules, patterns, dependencies and so on – is something surprisingly few teams do these days. Working on agile teams, I’ve invested some of my time in creating and maintaining these maps of the actual terrain and displaying them prominently in the team’s area – domain models, UX storyboards, key patterns we’ve applied (e.g., how have we done MVC?). You’d be amazed what gets missed when everyone’s buried in code, neck-deep in details, and nobody’s keeping an eye on the bigger picture. This, regrettably, is becoming a lost skill – the baby Agile threw out with the bathwater.

So we build our theoretical solution, and deliver it to end users to try. And this is where the design process really starts.

Until working code meets the real world, it’s all guesswork at best. We may learn that some of the restaurants are actually using dairy products in the preparation of their “vegan” dishes. Those naughty people! We may discover that different customers have very different ideas about what a “decent vegan menu” looks like. We may learn that our estimated delivery times are wildly inaccurate because restaurants tell fibs to get more orders. We may get hundreds of spoof orders from teenagers messing with the app from the other side of the world.

Here’s my point: once the system hits the real world, whatever we thought was going to happen almost certainly won’t. There are always lessons that can only be learned by trying it for real.

So we go again. And that is the true essence of software design.

When are we done? When we’ve solved the problem.

And then we move on to the next problem. (e.g., “Yeah, vegan food’s great, but what about vegan booze?”)

Will There Be A Post-Pandemic IT Boom?

For billions of people around the world, things are pretty uncertain now. Hundreds of millions have lost their jobs. Businesses of all sizes – but especially smaller and newer businesses, many start-ups – are in trouble. Many have already folded.

The experts predict a recession the likes of which we haven’t seen in anyone’s lifetime. But there may be one sector that – as the dust settles – might even grow faster as a result of the pandemic.

Information and communications technology has come to the fore as country after country locked down, commanding businesses who could to let their employees work from home. This would not have been possible a generation ago for the vast majority. Most homes did not have computers, and almost no homes had Internet. Now, it’s the reverse.

While some household name brands have run into serious difficulties, new brands have become household names in the last 3 months – companies like Zoom, for example. “Zooming” is now as much a thing as “hoovering”.

Meanwhile, tens of thousands of established businesses have had their digital transformations stress-tested for the first time, and have found them wanting. From extreme cases like UK retailer Primark, who effectively had no online capability, to old hands in every sector who’ve invested billions in digital over the last 30 years, it seems most were not quite as “digital” as it turns out they needed to be.

From customer-facing transactions to internal business processes, the pandemic has revealed gaps that were being filled by people necessarily co-located in offices and shops and factories and so on. A client of mine, for example, still doesn’t have the ability to sign up new suppliers without a human being in the accounts department to access the mainframe via one of the dedicated terminals. They are rushing now to close that gap, but their mainframe skills base has dwindled to the point where nobody knows how. So they have to hire someone with COBOL skills to make the changes on that side, and C# skills to write the web front end for it. Good luck with that!

I’m noticing these digital gaps everywhere. Most organisations have them. They were missed because the processes still worked, thanks to the magic of People Going To Offices™. But now those gaps have been laid bare for everyone to see (and for customers and suppliers to experience).

Here’s the thing: things aren’t going back to normal. The virus is going to be with us for some time, and even after we’ve tamed it with a vaccine or new treatments, everyone will be thinking about the next new virus. Just as COVID-19 leaves its mark on people it infects, the pandemic will leave its mark on our civilisation. We will adapt to a new normal. And a big component of that new normal will be digital technology. As wars accelerate science and technology, so too will COVID-19 accelerate digital innovation.

And it’s a match made in heaven, because this innovation can largely be done from our homes, thanks to… digital technology! It’s a self-accelerating evolution.

So, I have an inkling we’re going to be very busy in the near future.

10 Things Every *Good* Software Development Method Does

I’ve been a programmer for the best part of four decades – three of them professionally – and, for the last 25 years, a keen student of this thing we call “software development”.

I’ve studied and applied a range of software development methods, principles, and techniques over those years. While, on the surface, Fusion may look different to the Unified Process, which may look different to Extreme Programming, which may look different to DSDM, which may look different to Cleanroom Software Engineering, when you look under the hood of these approaches, they actually have some fundamental things in common.

Here are the 10 things every software developer should know:

  1. Design starts with end users and their goals – be it with use cases, or with user stories, or with the “features” of Feature-Driven Development, the best development approaches drive their solution designs by first asking: Who will be using this software, and what will they be using it to do?
  2. Designs grow one usage scenario at a time – scenarios or examples drive the best solution designs, and those designs are fleshed out one scenario at a time to satisfy the user’s goal in “happy paths” (or to recover gracefully from not satisfying the user’s goal, which we call “edge cases”). Developers who try to consider multiple scenarios simultaneously tend to bite off more than they can chew.
  3. Solutions are delivered one scenario at a time – teams who deliver working software in end-to-end slices of functionality (e.g., the UI, business logic and database required to do a thing the user requires) tend to fare better than teams who deliver horizontal slices across their architecture (the UI components for all scenarios, and then the business logic, and then the database code). This is for two key reasons. Firstly, they can get user feedback from working features sooner, which speeds up the learning process. Secondly, if they only manage to deliver 75% of the software before a release date, they will have delivered 75% of end-to-end working features, instead of 75% of the layers of all features. We call this incremental delivery.
  4. Solutions evolve based on user feedback from increments – the other key ingredient in the way we deliver working software is how we learn from the feedback we get from end users in each increment of the software. With the finest requirements and design processes – and the best will in the world – we can’t expect to get it right first time. Maybe our solution doesn’t give them what they wanted. Maybe what they wanted turns out to be not what they really needed. The only way to find out for sure is to deliver what they asked for and let them take it for a spin. And then the feedback starts flooding in. The best approaches accept that feedback is not just unavoidable, it’s very desirable, and teams seek it out as often as possible.
  5. Plans change – if we can’t know whether we’re delivering the right software for sure until we’ve delivered it, then our approach to planning must be highly adaptable. Although the wasteland of real-world software development is littered with the bleached bones of “waterfall” projects that attempted to get it right first time (and inevitably failed), the idealised world of software development methods rejected that idea many decades ago. All serious methods are iterative, and all serious methods tell us that the plan will necessarily change. It’s management who resist change, not methods.
  6. Code changes – if plans change based on what we learn from end users, then it stands to reason that our code must also change to accommodate their feedback. This is the sticking point on many “agile” development teams. Their management processes may allow for the plan to change, but their technical practices (or the lack of them) may mean that changing the code is difficult, expensive and risky. There are a range of factors in the cost of changing software, but in the wider perspective, it essentially boils down to “How long will it take to deliver the next working iteration to end users?” If the answer is “months”, then change is going to be slow and the users’ feedback will be backed up like the LA freeway on a Monday morning. If it’s “minutes” then you can iterate very rapidly and learn your way to getting it right much faster. Delivery cycles are fundamental. They’re the metabolism of software development.
  7. Testing is fast and continuous – if the delivery cycle of the team is its metabolism, then testing is its thyroid. How long it takes to establish if our software’s broken will determine how fast our delivery cycle can be (if the goal is to avoid delivering broken software, of course). If you aspire to a delivery cycle of minutes, then that leaves minutes to re-test your software. If all your testing’s done manually, then a modestly complex system will likely take weeks to re-test. And it’s a double whammy. Studies show that the longer a bug goes undetected, the more it costs to fix, and that cost grows exponentially. If I break some code now and find out a minute from now, it’s a trifle to fix it. If I find out 6 weeks from now, it’s a whole other ball game. Teams who leave testing late typically end up spending most of their time fixing bugs instead of delivering valuable features and changes. All of this can profoundly impact delivery cycles and the cost of adapting to user feedback. Testing early and often is a feature of all serious methods. Automating our tests so they run fast is a feature of all the best methods.
  8. All work is undo-able – If we accept that it’s completely unrealistic to expect to get things right first time, then we must also accept that all the work we do is essentially an experiment from which we must learn. Sometimes, what we’ll learn is that what we’ve done is simply no good, and we need to do over. Software Configuration Management (of which version control is the central pillar) is a key component of all serious software development methods. A practice like Continuous Integration, done right, can bring us high levels of undo-ability, which massively reduces risk in what is a pretty risky endeavour. To use an analogy, think of software development as a multi-level computer game. Experienced gamers know to back up their place in the game frequently, so they don’t have to replay huge parts of it after a boo-boo. Same thing with version control and SCM. We don’t want our versions to be too far apart, or we’ll end up in a situation where we have to redo weeks or months of work because we took a wrong turn in the maze.
  9. Architecture is a process (not a person or a thing) – The best development methods treat software architecture and design as an ongoing activity that involves all stakeholders and is never finished. Good architectures are driven directly from user goals, ensuring that those goals are satisfied by the design above all else (e.g., use case realisations in the Unified Process), and applying organising principles – Simple Design, “Tell, Don’t Ask”, SOLID etc – to the internals of the solution design to ensure the code will be malleable enough to change to meet future needs. As an activity, architecture encompasses everything from the goals and tasks of end users, to the modular structure of the solution, to the everyday refactorings that are performed against code that falls short, the test suites that guard against regressions, the documentation that ships with the end product, and everything else which is informed by the design process. Since architecture is all-encompassing, all serious development methods mandate that it be a shared responsibility. The best methods strongly encourage a high level of architectural awareness within the team through continuous visualisation and review of the design. To some extent, everyone involved is defining the architecture. It is ever-changing and everyone’s responsibility.
  10. “Done” means we achieved the customer’s end goal – All of our work is for nothing if we don’t solve the problem we set out to solve. Too many teams are short-sighted when it comes to evaluating their success, considering only that a list of requested features was delivered, or that a product vision was realised. But all that tells us is that we administered the medicine. It doesn’t tell us if the medicine worked. If iterative development is a search algorithm, then it’s a goal-seeking search algorithm. One generation of working software at a time, we ask our end users to test the solution as a fit to their problem, learn what worked and what didn’t, and then go around again with an improved solution. We’re not “done” until the problem’s been solved. While many teams pay lip service to business goals or a business context, it’s often more as an exercise in arse-covering – “We need a business case to justify this £10,000,000 CRM system we’ve decided to build anyway!” – than the ultimate driver of the whole development process. Any approach that makes defining the end goal a part of the development process has put the cart before the horse. If we don’t have an end goal – a problem to be solved – then development shouldn’t begin. But all iterative development methods – and they’re all iterative to some degree – can be augmented with an outer feedback loop that considers business goals and tests working software in business situations, driving everything from there.
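The 75% arithmetic in point 3 can be made concrete with a toy model. The feature names, the four layers and the tidy 4×4 grid below are assumptions chosen to keep the sums clean; real projects are messier, but the shape of the result is the same.

```python
FEATURES = ["search", "order", "pay", "track"]      # hypothetical features
LAYERS = ["ui", "api", "logic", "database"]         # hypothetical layers


def working_features(done):
    """A feature only works end-to-end once every one of its layers is done."""
    return [f for f in FEATURES if all((f, layer) in done for layer in LAYERS)]


all_work = [(f, layer) for f in FEATURES for layer in LAYERS]   # 16 pieces of work
budget = int(len(all_work) * 0.75)                               # time for only 12

# Vertical slices: finish one feature across all layers before starting the next.
vertical = set(all_work[:budget])

# Horizontal slices: finish one layer across all features before starting the next.
by_layer = [(f, layer) for layer in LAYERS for f in FEATURES]
horizontal = set(by_layer[:budget])

usable_vertical = working_features(vertical)       # ['search', 'order', 'pay']
usable_horizontal = working_features(horizontal)   # []
```

Same amount of work done in both cases; three features the user can try in one, none in the other.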

As a methodologist, I could spin you up an infinite number of software development methods with names like Goal-Oriented Object Delivery, or Customer Requirement Architectural Process. And, on the surface, I could make them all look quite different. But scratch the surface, and they’d all be fundamentally the same, in much the same way that programming languages – when you look past their syntax – tend to embrace the same underlying computing concepts.

Save yourself some time. Embrace the concepts.

Automate, Automate, Autonomy!

Thanks to pandemic-induced economic chaos, you’ve been forced to take a job on the quality assurance line at a factory that produces things.

The machine creates all kinds of random things, but your employer only sells a very specific subset of those things. All the things that don’t fit the profile have to be rejected, and melted down and fed back into the machine to make more things.

On your first day, you get training. (Oh, would that that were true in software development!)

They stand you at the quality gate and start up the machine. All kinds of things come down the line at you. Your line manager tells you “Only let the green things through”. You grab all the things that aren’t green and throw them into the recycle bin. So far, so good.

“Only let the green round things through!” shouts your line manager. Okay, you think. Bit harder now. All non-green, non-round things go in the bin.

“Only let the green round small things through!” Now you’re really having to concentrate. A few green round small things end up in the bin, and a few non-green, non-round, non-small things get through.

“Only let the green round small things with Japanese writing on them through!” That’s a lot to process at the same time. Now your brain is struggling to cope. A bunch of blue and red things with Japanese writing on them get through. A bunch of square things get through. Your score has gone from 100% accurate to just 90%. Either someone will have to go through the boxes that have been packed and pick out all the rejects, or they’ll have to deal with 10% customer returns after they’ve been shipped.

“Only let the green round small things with Japanese writing on them that have beveled edges and a USB charging port on the opposite side to the writing and a power button in the middle of the writing and a picture of a horse  – not a donkey, mind, reject those ones! – and that glow in the dark through!”

Now it’s chaos. Almost every box shipped contains things that should have been thrown in the recycle bin. Almost every order gets returned. That’s just too much to process. Too many criteria.

We have several choices here:

  1. Slow down the line so we can methodically examine every thing against our checklist, one criterion at a time.
  2. Hire a whole bunch of people and give them one check each to do.
  3. Reset customer expectations about the quality of the things they’re buying.
  4. Automate the checking using cameras and robots and lasers and super-advanced A.I. so all those checks can be made at production speed to a high enough accuracy.

Number 4 is the option that potentially gives us the win-win of customer satisfaction and high productivity without the bigger payroll. It’s been the driving force behind the manufacturing revolutions in East Asia for the last 70 years: automate, automate, automate.
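In software terms, option 4 looks a lot like an automated test suite: each criterion becomes one small check, and a machine applies all of them to every item at full speed. The sketch below is only an analogy in code; the `Thing` fields and the four checks are lifted from the factory story, and the rest is assumed.

```python
from dataclasses import dataclass


@dataclass
class Thing:
    colour: str
    shape: str
    size: str
    writing: str


# Each criterion is one small, independent check.
CHECKS = [
    lambda t: t.colour == "green",
    lambda t: t.shape == "round",
    lambda t: t.size == "small",
    lambda t: t.writing == "japanese",
]


def passes_quality_gate(thing, checks=CHECKS):
    """A thing ships only if every check passes."""
    return all(check(thing) for check in checks)


line = [
    Thing("green", "round", "small", "japanese"),
    Thing("blue", "round", "small", "japanese"),
    Thing("green", "square", "small", "japanese"),
]
shipped = [t for t in line if passes_quality_gate(t)]
```

Adding a fifth or a fiftieth criterion means appending one more check to the list, and the gate stays just as accurate.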

But it doesn’t come for free. High levels of automation require considerable ongoing investment in time, technology and training. In the UK, we’ve under-invested, becoming more and more inefficient and expensive while the quality of our output has declined. Shareholders want their return now. There’s no appetite for making improvements for the future.

There are obvious parallels in software development. Businesses want their software now. Most software organisations have little inclination to invest the time, technology and training required to reach the high levels of automation needed to achieve the coveted continuous delivery that would allow them to satisfy customer needs sooner, cheaper, and for longer.

The inescapable reality is that frictionless delivery demands an investment of 20-25% of your total software development budget. To put it more bluntly, everyone should be spending 1 day a week not on immediate customer requirements, but on making improvements in the delivery process that would mean meeting future customer requirements will be easier.

And so, for most teams, it never gets easier. The software just gets buggier, later and more expensive year after year.

What distinguishes those software teams who are getting it right from the rest? From observation, I’ve seen the same factor every time: autonomy. Teams will invest that 20-25% when it’s their choice. They’re tasked with delivering value, and allowed to figure out how best to do that. Nobody’s telling them how to do their jobs.

How did this blissful state come about? Again, from observation, those teams have autonomy because they took it. Freedom is rarely willingly given.

Now, I appreciate this is a whole can of worms. To take their autonomy, teams need to earn trust. The more trust a team has earned, the more likely they’ll be left alone. And this can be a chicken and egg kind of situation. To earn trust, the team has to reliably deliver. To reliably deliver, the team needs autonomy. This whole process must begin with a leap of faith on the business’s part. In other words, they have to give teams the benefit of the doubt long enough to see the results.

And here come the worms… Teams have to win over their customer from the start, before anything’s been delivered – before the customer’s had a chance to taste our pudding. This means that developers need to inspire enough confidence with their non-technical stakeholders – remember, this is a big money game – to reassure everyone that they’re in good hands. And we’re really, really bad at this.

The temptation is to over-promise, and set unrealistic expectations. This pretty much guarantees disappointment. The best way to inspire confidence is to have a good track record. No lawyer can guarantee to win your case. But a lawyer who won 9 of their last 10 cases is going to inspire more confidence than a lawyer who’s taking this as their first case promising you a win.

And we’re really, really bad at this, too – chiefly because software development teams are newly formed for that specific piece of work and don’t have a track record to speak of. Sure, individual developers may be known quantities, but in software, the unit of delivery is the team. I’ve watched teams of individually very strong developers fall flat on their collective arse.

And this is why I believe that this picture won’t change until organisations start to view teams as assets, and invest in them for a long-term pay-off as well as short-term delivery, 20/80. And, again, I don’t think this will be willingly given. So maybe we – as a profession – need to take the decision out of their hands.

It could all start with one big act of collective autonomy.

 

 

Why COBOL May Be The Language In Your Future

Yes, I know. Preposterous! COBOL’s 61 years old, and when was the last time you bumped into a COBOL programmer still working? Surely, Java is the new COBOL, right?

Think again. COBOL is alive and well. Some 220 billion lines of it power 71% of Fortune 500 companies. If a business is big enough and has been around long enough, there’s a good chance the lion’s share of the transactions you do with that business involves some COBOL.

Fact is, they’re kind of stuck with it. Mainframe systems represent a multi-trillion dollar investment going back many decades. COBOL ain’t going nowhere for the foreseeable future.

What’s going is not the language but the programmers who know it and who know those critical business systems. The average age of a COBOL programmer in 2014 was 55. No doubt in 2020 it’s older than that, as young people entering IT aren’t exactly lining up to learn COBOL. Colleges don’t teach it, and you rarely hear it mentioned within the software development community. COBOL just isn’t sexy in the way Go or Python are.

As the COBOL programmer community edges towards statistical retirement – with the majority already retired (and frankly, dead) – the question looms: who is going to maintain these systems in 10 or 20 years’ time?

One thing we know for sure: businesses have two choices – they can either replace the programmers, or replace the programs. Replacing legacy COBOL systems has proven to be very time-consuming and expensive for some banks. Commonwealth Bank of Australia took 5 years and $750 million to replace its core COBOL platform in 2012, for example.

And to replace a COBOL program, developers writing the new code at least need to be able to read the old code, which will require a good understanding of COBOL. There’s no getting around it: a bunch of us are going to have to learn COBOL one way or another.

I did a few months of COBOL programming in the mid-1990s, and I’d be lying if I said I enjoyed it. Compared to modern languages like Ruby and C#, COBOL is clunky and hard work.

But I’d also be lying if I said that COBOL can’t be made to work in the context of modern software development. In 1995, we “version controlled” our source files by replacing listings in cupboards. We tested our programs manually (if we tested them at all before going live). Our release processes were effectively the same as editing source files on the live server (on the mainframe, in this case).

But it didn’t need to be like that. You can manage versions of your COBOL source files in a VCS like Git. You can write unit tests for COBOL programs. You can do TDD in COBOL (see Exhibit A below).

You can refactor COBOL code (“Extract Paragraph”, “Extract Program”, “Move Field” etc), and you can automate a proper build and release process to deploy changed code safely to a mainframe (and roll it back if there’s a problem).

It’s possible to be agile in COBOL. The reason why so much COBOL legacy code fails in that respect has much more to do with decades of poor programming practices and very little to do with the language or the associated tools themselves.

I predict that, as more legacy COBOL programmers retire, the demand – and the pay – for COBOL programmers will rise to a point where some of you out there will find it irresistible. And the impact on society if they can’t be found will be severe.

The next generation of COBOL programmers may well be us.

Is Your Agile Transformation Just ‘Agility Theatre’?

I’ve talked before about what I consider to be the two most important feedback loops in software development.

When I explain the feedback loops – the “gears” – of Test-Driven Development, I go to great pains to highlight which of those gears matter most, in terms of affecting our odds of success.

[Figure: the feedback loops – the “gears” – of Test-Driven Development]

Customer or business goals drive the whole machine of delivery – or at least, they should. We are not done because we passed some acceptance tests, or because a feature is in production. We’re only done when we’ve solved the customer’s problem.

That’s very likely going to require more than one go-around. Which is why the second most important feedback loop is the one that establishes if we’re good to go for the next release.

The ability to establish quickly and effectively if the changes we made to the software have broken it is critical to our ability to release it. Teams who rely on manual regression testing can take weeks to establish this, and their release cycles are inevitably very slow. Teams who rely mostly on automated system and integration tests have faster release cycles, but still usually far too slow for them to claim to be “agile”. Teams who can re-test most of the code in under a minute are able to release as often as the customer wants – many times a day, if need be.

The speed of regression testing – of establishing if our software still works – dictates whether our release cycles span months, weeks, or hours. It determines the metabolism of our delivery cycle and ultimately how many throws of the dice we get at solving the customer’s problem.

It’s as simple as that: faster tests = more throws of the dice.

If the essence of agility is responding to change, then I conclude that fast-running automated tests lie at the heart of that.

What’s odd is how so many “Agile transformations” seem to focus on everything but that. User stories don’t make you responsive to change. Daily stand-ups don’t make you responsive to change. Burn-down charts don’t make you responsive to change. Kanban boards don’t make you responsive to change. Pair programming doesn’t make you responsive to change.

It’s all just Agility Theatre if you’re not addressing the two most fundamental feedback loops, which the majority of organisations simply don’t. Their definition of done is “It’s in production”, as they work their way through a list of features instead of trying to solve a real business problem. And they all too often under-invest in the skills and the time needed to wrap software in good fast-running tests, seeing that as less important than the index cards and the Post-It notes and the Jira tickets.

I talk often with managers tasked with “Agilifying” legacy IT (e.g., mainframe COBOL systems). This means speeding up feedback cycles, which means speeding up delivery cycles, which means speeding up build pipelines, which – 99.9% of the time – means speeding up testing.

After version control, it’s #2 on my list of How To Be More Agile. And, very importantly, it works. But then, we shouldn’t be surprised that it does. Maths and nature teach us that it should. How fast do bacteria or fruit flies evolve – with very rapid “release cycles” of new generations – vs elephants or whales, whose evolutionary feedback cycles take decades?

There are two kinds of Agile consultant: those who’ll teach you Agility Theatre, and those who’ll shrink your feedback cycles. Non-programmers can’t help you with the latter, because the speed of the delivery cycle is largely determined by test execution time. Speeding up tests requires programming, as well as knowledge and experience of designing software for testability.

70% of Agile coaches are non-programmers. A further 20% are ex-programmers who haven’t touched code for over a decade. (According to the hundreds of CVs I’ve seen.) That suggests that 90% of Agile coaches are teaching Agility Theatre, and maybe 10% are actually helping teams speed up their feedback cycles in any practical sense.

It also strongly suggests that most Agile transformations have a major imbalance: investing heavily in the theatre, but little if anything in speeding up delivery cycles.

How The Way We Measure “Productivity” Makes Us Take Bad Bets

Encouraging news from Oxford as researcher Sarah Gilbert says she’s “80% confident” the COVID-19 vaccine her team has been testing will work and may be ready by the autumn.

Except…

As a software developer, the “80%” makes my heart sink. I know from bitter experience that 80% Done on solving a problem is about as meaningless a measure as you can get. The vaccine will either solve the problem – allowing us to get back to normal – or it won’t.

In software development, I apply similarly harsh logic. Teams may tell me that they’re “80% done” when they mean “We’ve built 80% of the features” or “We’ve written 80% of the code”. More generally: “We’re 80% through a plan”.

A plan is not necessarily a solution. Several promising vaccines are undergoing human trials as we speak, though. So, while Gilbert’s 80% Done may eventually turn out to be the 20% Not Done after more extensive real-world testing, there are enough possible solutions out there to give me hope that a vaccine will be forthcoming within a year.

Think of “80% done” as a 4/5 chance that it’ll work. There are several 4/5 chances – several rolls of the dice, which give the world cumulatively pretty good odds. Bill Gates’s plan to build multiple factories and start manufacturing multiple vaccines before the winner’s been identified will no doubt speed things up. And there are more efforts going on around the world if those all fail.

Software, not so much. Typically, a software development effort is the only game in town – all the eggs in a single basket, if you like. And this has always struck me as irrational behaviour on the part of organisations. At best, the design of a solution is complete guesswork as to whether or not it will solve the customer’s problem. It’s a coin toss. But a lot of organisations plan just to toss a single coin, and only once. Two coin tosses would give them 75% odds. 3 would give them 87.5%. 4 would give them 93.75%. And so on.
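Those odds are easy to check for yourself. Here’s a minimal sketch, assuming each attempt is an independent toss of a fair coin (the function name and the 0.5 default are my own illustration, not a real-world measurement):

```python
def odds_of_at_least_one_success(attempts: int, p_success: float = 0.5) -> float:
    """Chance that at least one of `attempts` independent tries succeeds.

    Assumes every try has the same chance of success and that the tries
    do not influence each other.
    """
    # The only way to fail overall is for every single attempt to fail.
    return 1 - (1 - p_success) ** attempts


for attempts in range(1, 5):
    print(f"{attempts} toss(es): {odds_of_at_least_one_success(attempts):.2%}")
```

Change p_success to something less generous than a coin toss and you’ll see how many more throws it takes to reach the same odds.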

It’s more complex than that, of course. In real life, there are significant odds that we’re barking up completely the wrong tree. We can’t fix a fundamentally flawed solution by refining it. So iterating only helps when we’re in the ballpark to begin with.

Software solutions – to have the best odds of succeeding in the real world – need to start with populations of possible solutions, just like the COVID-19 solution starts with a population of potential vaccines. If there was only one team working on one vaccine, I’d be very, very worried right now.

Smart organisations – of which there are sadly very few, it would seem – start projects by inviting teams or individuals to propose solutions. The most promising of those are given a little bit of budget to develop further, so they can at least go through some preliminary testing with small groups of end users. These Minimum Viable Products are whittled down further to the most promising, and more budget is assigned to evolve them into potential products. Eventually, one of those products will win out, and the rest of the budget is assigned to that to take it to market (which could mean rolling it out as an internal system into the business, if that’s the kind of problem being solved.)

We know from decades of experience and some big studies that the bulk of the cost of software is downstream. For example, my team was given a budget of £200,000 to develop a job site in the late 90s. The advertising campaign for the site’s launch cost £4 million. The team of sales, marketing people and admin people who ran the site cost £2.5 million a year. The TCO of the software itself was about £2.8 million over 5 years.

Looking back, it seems naive in the extreme that the business went with the first and only idea for the design of the site that was presented to them, given the size of the bet they were about to place. (Even more naive that the design was presented as a database schema, with no use cases – but that’s a tale for another day.)

Had I been investing that kind of money, I would have spent maybe £10,000 each on the four most promising prototypes – assigning one pair of developers to each one. After 2 weeks, I would have jettisoned the two least promising – based on real end user testing – and merged the fallow pairs into two teams, then given them £40,000 each for further development. After another 4 weeks, and more user testing, I would have selected the best of the two, merged the two teams into one, and invested the remaining £80,000 in four more weeks of development to turn it into a product.

Four throws of the dice buys you a 93.75% chance of success. Continuous user feedback on the most promising solution raises the odds even further.

But what most managers hear when I say “Start with 8 developers working on 4 potential solutions” is WASTE. 75% of the effort in the first two weeks is “wasted”. 50% of the effort in the next 4 weeks is “wasted”. The total waste is only 35%, though – measured in weeks spent by developers on software that ultimately won’t get used (12 of the 16 developer-weeks in the first stage plus 16 of the 32 in the second, out of 80 in total).

Nearly two thirds of the time invested is devoted to the winning solution. That’s in exchange for much higher odds of success. If we forecast waste as time spent multiplied by odds of failure, then having all 8 developers work on a single possible solution – a toss of a coin – presents a risk of wasting 40 weeks of developer time (or half our budget).

Starting with 4 possible solutions uses the exact same amount of developer time and budget for a 93.75%+ chance of succeeding. Risk of waste is actually – in real terms – only 6.25% of that total budget, even though we know that about a third of the software written won’t get used.

But that’s only if you measure waste in terms of problems not solved instead of software not delivered.

The same investment: £200,000. Starkly different odds of success. Far lower risk of wasting time and money.
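For anyone who wants to check the sums, here’s a rough sketch of that staged plan. The Stage type and the helper names are hypothetical, made up for illustration, and it assumes (as above) that every candidate solution is an independent coin toss:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    weeks: int
    solutions: int            # candidate solutions still in play
    budget_per_solution: int  # pounds


DEVELOPERS = 8  # assumption: the whole team is busy in every stage

STAGES = [
    Stage(weeks=2, solutions=4, budget_per_solution=10_000),
    Stage(weeks=4, solutions=2, budget_per_solution=40_000),
    Stage(weeks=4, solutions=1, budget_per_solution=80_000),
]


def total_budget(stages: list[Stage]) -> int:
    return sum(s.solutions * s.budget_per_solution for s in stages)


def total_developer_weeks(stages: list[Stage], developers: int) -> int:
    return sum(s.weeks * developers for s in stages)


def forecast_waste(amount: float, odds_of_failure: float) -> float:
    """Waste forecast as amount spent multiplied by odds of failure."""
    return amount * odds_of_failure


budget = total_budget(STAGES)
developer_weeks = total_developer_weeks(STAGES, DEVELOPERS)

single_bet_failure = 0.5                        # one coin toss
four_bets_failure = 0.5 ** STAGES[0].solutions  # all four candidates fail

print(f"Budget: £{budget:,} over {developer_weeks} developer-weeks")
print(f"One solution: {forecast_waste(developer_weeks, single_bet_failure):.0f} developer-weeks at risk")
print(f"Four solutions: £{forecast_waste(budget, four_bets_failure):,.0f} at risk")
```

The stages add up to the same £200,000 and the same 80 developer-weeks either way. Only the forecast waste changes.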

And that’s just the money we spent on writing the software. Now think about my job site. Many millions more were spent on the business operation that was built on that software. Had we delivered the wrong solution – spoiler alert: we had – then that’s where the real waste would be.

Focusing on solving problems makes us more informed gamblers.

 

Why I Abandoned Business Modeling

So, as you may have gathered, I have a background in model-driven processes. I drank the UML Kool-Aid pretty early on, and by 2000 was a fully paid-up member of the Cult of Boxes And Arrows Solve Every Problem.

The big bucks for us architect types back then – and, probably still – came with a job title called Enterprise Architect. Enterprise Architecture is built on the idea that organisations like businesses are essentially machines, with moving connected parts.

Think of it like a motor car: there was a steering wheel, which executives turned to point the car in the direction they wanted to go. This was connected through various layers of mechanisms – business processes, IT systems, individual applications, actual source code – and segregated into connected vertical slices for different functions within the business, different business locations and so on.

The conceit of EA was that we could connect all those dots and create strategic processes of change where the boss changes a business goal and that decision works its way seamlessly through this multi-layered mechanism, changing processes, reconfiguring departments and teams, rewriting systems and editing code so that the car goes in the desired new direction.

It’s great fun to draw complex pictures of how we think a business operates. But it’s also a fantasy. Businesses are not mechanistic or deterministic in this way at all. First of all, modeling a business of any appreciable size requires us to abstract away all the insignificant details. In complex systems, though, there are no such things as “insignificant details”. The tiniest change can push a complex system into a profoundly different order.

And that order emerges spontaneously and unpredictably. I’ve watched some big businesses thrown into chaos by the change of a single line of code in a single IT system, or by moving the canteen to a different floor in HQ.

2001-2003 was a period of significant evolution of my own thinking on this. I realised that no amount of boxes and arrows could truly communicate what a business is really like.

In philosophy, they have this concept of qualia – individual instances of subjective, conscious experience. Consider this thought experiment: you’re locked in a tower on a remote island. Everything in it is black and white. The tower has an extensive library of thousands of books that describe everything you could possibly need to know about the colour orange. You have studied the entire contents of that library, and are now the world’s leading authority on orange.

Then, one day, you are released from your tower and allowed to see the world. The first thing you do, naturally, is go and find an orange. When you see the colour orange for the first time – given that you’ve read everything there is to know about it – are you surprised?

Two seminal professional experiences I had in 2002-2004 convinced me that you cannot truly understand a business without seeing and experiencing it for yourself. In both cases, we’d had teams of business analysts writing documents, creating glossaries, and drawing boxes and arrows galore to explain the organisational context in which our software was intended to be used.

I speak box-and-arrow fluently, but I just wasn’t getting it. So many hidden details, so many unanswered questions. So, after months of going round in circles delivering software that didn’t fit, I said “Enough’s enough” and we piled into a minibus and went to the “shop floor” to see these processes for ourselves. The mist cleared almost immediately.

Reality is very, very complicated. All we know about conscious experience suggests that our brains are only truly capable of understanding complex things from first-hand experience of them. We have to see them and experience them for ourselves. Accept no substitutes.

Since then, my approach to strategic systems development has been one of gaining first-hand experience of a problem, and trying simple things we believe might solve those problems, seeing and measuring what effect they have, and feeding back into the next attempt.

Basically, I replaced Enterprise Architecture with agility. Up to that point, I’d viewed Agile as a way of delivering software. I was already XP’d up to the eyeballs, but hadn’t really looked beyond Extreme Programming to appreciate its potential strategic role in the evolution of a business. There have to be processes outside of XP that connect business feedback cycles to software delivery cycles. And that’s how I do it (and teach it) now.

Don’t start with features. Start with a problem. Design the simplest solution you can think of that might solve that problem, and make it available for real-world testing as soon as you can. Observe (and experience) the solution being used in the real world. Feed back lessons learned and go round again with an evolution of your solution. Rinse and repeat until the problem’s solved (my definition of “done”). Then move on to the next problem.

The chief differences between Enterprise Architecture and this approach are that:

a. We don’t make big changes. In complex adaptive systems, big changes != big results. You can completely pull a complex system out of shape, and over time the underlying – often unspoken – rules of the system (the “insignificant details” your boxes and arrows left out, usually) will bring it back to its original order. I’ve watched countless big change programmes produce no lasting, meaningful change.

b. We begin and end in the real world

In particular, I’ve learned from experience that the smallest changes can have the largest impact. We instinctively believe that to effect change at scale, we must scale our approach. Nothing could be further from the truth. A change to a single line of code can cause chaos at airport check-ins and bring traffic in an entire city to a standstill. Enterprise Architecture gave us the illusion of control over the effects of changes, because it gave us the illusion of understanding.

But that’s all it ever was: an illusion.

Is UML Esperanto for Programmers?

Back in a previous life, when I wore the shiny cape and the big pointy hat of a software architect, I thought the Unified Modeling Language was a pretty big deal. So much, in fact, that for quite a few years, I taught it.

In 2000, there was demand for that sort of thing. But by 2006 demand for UML training – and for UML on teams – had faded away to pretty much nothing. I rarely see it these days, on whiteboards or on developer machines. I occasionally see the odd class diagram or sequence diagram, often in a book. I occasionally draw the odd class diagram or sequence diagram myself – maybe a handful of times a year, when the need arises to make a point that such diagrams are well-suited to explaining.

UML is just one among many visualisation tools in my paint box. I use Venn diagrams when I want to visualise complex rules, for example. I use tables a lot – to visualise how functions should respond to inputs, to visualise state transitions, and to visualise conditional logic (e.g., truth tables). But we fixated on just that one set of diagrams, until UML became synonymous with software visualisation.

I’m a fan of pictures, you see. I’m a very visual thinker. But I’m aware that visual thinkers seem to be in a minority in computing. I often find myself being the only one in the room who gets it when they see a picture. Many programmers want to see code. So, on training courses now, I show them code, and then they get it.

Although UML has withered away, its vestigial limb remains in the world of academia. A lot of universities teach it, and in significant depth. In Computer Science departments around the world, Executable UML is still very much a thing and students may spend a whole semester learning how to specify systems in UML.

Then they graduate and rarely see UML again – certainly not Executable UML. The ones who continue to use it – and therefore not lose that skill – tend to be the ones who go on to teach it. Teaching keeps UML alive in the classroom long after it all but died in the office.

My website parlezuml.com still gets a few thousand visitors every month, and the stats clearly show that the vast majority are coming from university domains. In industry, UML is as dead a language as Latin. It’s taught to people who may go on to teach it, and elements of it can be found in many spoken languages today. (There were a lot of good ideas in UML). But there’s no country I can go to where the population speak Latin.

Possibly a more accurate comparison to UML might be Esperanto. Like UML, Esperanto was created – I think it’s perhaps, aside from Klingon, one of the only examples of a completely artificial spoken language – in an attempt to unify people and get everyone “speaking the same language”. As noble a goal as that may be, the reality of Esperanto is that the people who can speak it today mostly speak it to teach it to people who may themselves go on to teach it. Esperanto lives in the classroom – my Granddad Ray taught it for many years – and at conferences for enthusiasts. There’s no country I can go to where the population speak it.

And these days, I visit vanishingly few workplaces where I see UML being used in anger. It’s the Esperanto of software development.

I guess my point is this: if I was studying to be an interpreter, I would perhaps consider it not to be a good use of my time to learn Esperanto in great depth. For sure, there may be useful transferable concepts, but would I need to be fluent in Esperanto to benefit from them?

Likewise, is it really worth devoting a whole semester to teaching UML to students who may never see it again after they graduate? Do they need to be fluent in UML to learn its transferable lessons? Or would a few hours on class diagrams and sequence diagrams serve that purpose? Do we need to know the UML meta-meta-model to appreciate the difference between composition and aggregation, or inheritance and implementation?

Do I need to understand UML stereotypes to explain the class structure of my Python program, or the component structure of my service-oriented architecture? Or would boxes and arrows suffice?

If the goal of UML is to be understood (and to understand ourselves), then there are many ways beyond UML. How much of the 794-page UML 2.5.1 specification do I need to know to achieve that goal?

And why have they still not added Venn diagrams, dammit?! (So useful!)

So here’s my point: after 38 years programming – 28 of them for money – I know what skills I’ve found most essential to my work. Visualisation – drawing pictures – is definitely in that mix. But UML itself is a footnote. It beggars belief how many students graduate having devoted a lot of time to learning an almost-dead language but somehow didn’t find time to learn to write good unit tests or to use version control or to apply basic software design principles. (No, lecturers, comments are not a design principle.)

Some may argue that such skills are practical, technology-specific and therefore vocational. I disagree. There’s JUnit. And then there’s unit testing. I apply the same ideas about test structure, about test code design, about test organisation and optimisation in RSpec, in Mocha, in xUnit.net etc.

And UML is a technology. It’s an industry standard, maintained by an industry body. There are tools that apply – some more loosely than others – the standard, just like browsers apply W3C standards. Visual modeling with UML is every bit as vocational as unit testing with NUnit, or version control with Git. There’s an idea, and then that idea is applied with a technology.

You may now start throwing the furniture around. Message ends.

For Distributed Teams, Code Craft is Critical

Right now, most software teams all around the world are working from home. Many have not done it before, and are on a learning curve that means last week’s productivity won’t be returning for a while.

I’ve worked on distributed teams many times, and – through Codemanship – trained and mentored dozens of teams remotely. One thing I’ve learned from all that remote development experience is that coding discipline becomes super-important.

Just as distributed systems amplify every design flaw, turning what would be a headache in a monolith into a major outbreak in a service-oriented architecture, distributed working amplifies team dysfunctions as the communication pathways take on extra weight.

Here’s how code craft can help:

  • Unit tests – keeping the software working is Distributed Dev Team 101. Open Source projects rely on suites of fast-running tests to protect against check-ins that break the code.
  • Continuous Integration – is how distributed teams communicate their changes to each other. Co-located teams should merge their changes to the master branch often and be build aware, keeping one eye on other people’s merges to see what’s changed. But it’s much easier on co-located teams to keep everyone in step because we can see and talk to each other about the changes we’re making. If remote developers do infrequent large merges, integration hell gets amplified tenfold by the extra communication barriers.
  • Test-Driven Development – a lot of the communication between developers, and between developers and their customers, can be handwavy and vague. And if communication is easy – like on a co-located team – we just go around a few more times until we converge on what’s required. But when communication is harder, like in distributed teams, a few more goes around gets very expensive. Using executable tests as specifications removes the ambiguity. It should do exactly this. Also, TDD – done well – produces suites of useful, fast-running automated tests. It’s a win-win.
  • Design Principles – Well-factored code is very important to co-located teams, and super-duper-important to distributed teams. Let’s count the ways:
    • Simple Design
      • Code should work – if it don’t work, we can’t ship it. Any changes that break the code block the team. It’s a big deal on a co-located team, but it’s a really big deal on a distributed team.
      • Code should clearly communicate its intent – code should speak for itself, and when developers are working remotely, and communicating requires extra effort, this is especially true. The easier code is to understand, the fewer teleconferences required to understand it.
      • Code should be free of duplication – so much duplication in software is duplication of concepts. This often occurs when developers on teams work in isolation, unaware that someone else has already added a module that does what their module also does. Devs need to be aware of duplication in the code – Continuous Integration and merge awareness helps – and clued up to when they should refactor it and when they should leave it alone.
      • Code should be as simple as we can make it – every line of code that has to be maintained is another straw on the camel’s back. When the camel’s back stretches between multiple locations – possibly in multiple time zones – the impact of every additional straw is felt many-fold.
    • Modular Design
      • Modules should do one job – the ability to change the behaviour of a system by just editing one module is critical to a team’s ability to make the changes they need without treading on the toes of other developers. On distributed teams, multiple developers all making changes to one module for multiple reasons can lead to some spectacular merge train wrecks.
      • Modules should hide their internal workings – the more modules are coupled to each other, the bigger and wider the impact of even the smallest changes will be felt. Imagine your distributed team is working precariously balanced on high wires that are all interconnected. What you don’t want is for one person to start violently shaking their wire, sending ripples throughout the network. Or it could all come tumbling down. Again, it’s bad on co-located teams, but it’s Double-Plus-Triple-Word-Score-Bad on distributed teams. Every dependency can bring pain.
      • Modules should not depend directly on implementations of other modules – it’s good architecture generally for modules not to bind directly to implementations of the other modules they use, for a variety of reasons. But it’s especially important when teams aren’t co-located. Taken together, the first three principles of modular design are better known as “Separation of Concerns”. Or, as I like to call it, the Principle of Somebody Else’s Problem. If my module needs to send an email, I shouldn’t need to know how emails are actually sent – all that detail should be hidden from me – and I should be able to work on my code without having to actually send emails when I test it. Sending emails is somebody else’s problem. It’s particularly useful in a test-driven approach to design to be able to write a test for code that has external dependencies – things it uses that other developers are working on – without actually binding directly to the implementation of that external component so that we can swap in a test double that pretends to do that job. That’s how you scale TDD. That’s how you make TDD work in distributed teams, too.
      • Module interfaces should be designed from the client’s point of view – tied together with TDD, we can specify modules very precisely from the outside: this is what it should look like (interface) and this is what it should do (tests). Imagine your distributed team is making a jigsaw: the hard way to do it is to have each person go off and make a piece of the jigsaw and then hope that they all fit together at the end. The smart way to do it is to define the shapes of the pieces as parts of the whole puzzle, and then have people implement the pieces based on the interfaces and tests agreed. You do this by designing systems from the outside in, defining modules by how they will be used from the client code’s POV. This also helps to restrict public interfaces to only what clients need to see, hiding internal details, improving encapsulation and reducing coupling. Coupling on distributed teams can be very, very expensive.
    • Refactoring – the still-rather-too-rare discipline of reshaping code without breaking the software is the means by which we achieve good design. Try as we might to never write code that’s hard to understand, or has duplication, or is overly complex, or too tightly coupled, we’ll always need to clean up our code as we go. If the impact of poor design is amplified on distributed teams, the importance of refactoring must be proportionally amplified. The alternative is relying on after-the-fact code reviews (e.g., in GitFlow), which will become multiple times the bottleneck they already were when your team was co-located and you could just pop over to Mary’s desk and ask.
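
To make the email example from the modular design principles a bit more concrete, here is a minimal sketch in Python. All of the names in it (EmailSender, OrderConfirmation, RecordingEmailSender) are hypothetical and purely for illustration. The client module depends only on an interface shaped around what it needs, and the test swaps in a test double that pretends to send email.

```python
from typing import Protocol


class EmailSender(Protocol):
    """Hypothetical interface, shaped by what the client needs to call.

    How email actually gets sent is somebody else's problem.
    """

    def send(self, to: str, subject: str, body: str) -> None: ...


class OrderConfirmation:
    """Hypothetical client module: it knows only the EmailSender interface."""

    def __init__(self, sender: EmailSender) -> None:
        self._sender = sender

    def confirm(self, customer_email: str, order_id: str) -> None:
        self._sender.send(
            to=customer_email,
            subject=f"Order {order_id} confirmed",
            body="Thanks for your order.",
        )


class RecordingEmailSender:
    """Test double: records each message instead of sending it."""

    def __init__(self) -> None:
        self.sent: list[tuple[str, str, str]] = []

    def send(self, to: str, subject: str, body: str) -> None:
        self.sent.append((to, subject, body))


def test_confirmation_email_is_sent_to_customer() -> None:
    sender = RecordingEmailSender()
    OrderConfirmation(sender).confirm("jane@example.com", "A123")
    assert sender.sent == [
        ("jane@example.com", "Order A123 confirmed", "Thanks for your order.")
    ]
```

Whoever is working on the real sender, whether that's SMTP or a third-party API, only has to honour the same send method. The two developers can agree the interface and the tests up front, then work independently without ever sending a real email from a unit test.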

Underpinning all of this is a need for levels of delivery process automation – automated testing, automated builds, automated deployments, automated code reviews – that the majority of teams are nowhere near.

And then there’s the interpersonal: the communication, the coordination, the planning and tracking, the collaborative design. It takes a big investment to make a distributed Agile team as productive as a co-located team.

All the Jiras and GitHubs and cloud-based build pipelines and remote whiteboards and shared IDEs and Zoom meetings in the world won’t save you if the code craft isn’t up to snuff, though. It’s foundational to delivering as a distributed team.

If you want to know more about code craft, visit www.codemanship.com