It’s About The Loops, Stupid!

2 essential models of software development:

1. Queuing – value “flows”, software is “delivered” etc. This is the “incremental” in “iterative & incremental”

2. Learning – teams converge on a working solution by iterating a design

The 1st model is fundamentally wrong & damaging.

The queue is the pipeline that takes the idea in our heads and turns it into working software users can try for real. But over-emphasis on that queue tends to obscure that it’s a loop feeding directly back into itself with better ideas.

And when we examine it more closely, considering the technical practices of software delivery, we see it’s actually lots of smaller interconnected loops (e.g., customer tests, red-green-refactor, CI etc) – wheels within wheels.

But our obsession with the pipeline tends to disguise this truth and frames all discussions in terms of queues instead of loops. Hence most teams mistake “delivered” for “done” and ignore the most important thing – feedback.

Psychologically, the language we use to describe what we do has a sort of queue/flow bias. And hence most “agile” teams just end up working their way incrementally through a thinly-disguised waterfall plan (called a “backlog”, or a “roadmap”).

They’re the workers in a parcel loading bay. They’re just loading “parcels” (features) on to a loading bay (dev) and then into a truck (ops).

Most dev teams have little or no visibility of what happens after those parcels have been “delivered”. What they don’t see at the other end is the customer trying to make something work with what’s in the parcels. Maybe they’re building a rocket, but we keep sending them toasters.

This mismatch in goals/motivations – “deliver all the toasters” vs “build a rocket” – is how the working relationship between dev teams and customers usually breaks down. We should all be trying to build the rocket.

It took humans decades (and hundreds of thousands of people) to learn how to build a rocket, with a lot of science and engineering (guesswork, basically) but mostly a lot of trial and error. We must learn with our customer what really needs to be in the parcels we deliver.

The Xmas Train Kata


It’s been a while since I set a programming challenge with a seasonal theme – about a year, in fact – so here’s a new one which I hope you’ll enjoy, even if you don’t celebrate Xmas.

Imagine we run a railway that Santa and his helpers use to ship presents from their factory in Lapland non-stop to a distribution depot 860 km away just outside Helsinki. (That’s how it works. Don’t argue.)

The elves working at the depot need to know what time to expect the train so they can make sure everything’s ready at their end – reindeer fed, sleigh oiled, Santa’s lunchbox packed, etc. To aid in improving the accuracy of their ETA, they have commissioned us to write a software system.

They’ve placed sensors along the track at intervals of about 1 km that send a signal to the depot as the train passes. There are contact points at the front and the rear of the train, so each passing of a sensor triggers 2 data messages – front and rear. The train is 200 m long.

Included in each data message is location information about that particular sensor telling us how far along the track from the factory in Lapland it is in kilometres.

The elves, of course, insist on a JSON format for these data messages, which looks like this:

First Message

{
    "passing": {
        "datetime": "2019-12-24T19:37:46.854Z",
        "contact": "front",
        "distance": "183.449 km"
    }
}

Second Message

{
    "passing": {
        "datetime": "2019-12-24T19:37:52.229Z",
        "contact": "rear",
        "distance": "183.449 km"
    }
}

Our software must determine the time that has elapsed between the first and second message, and use that information to calculate the speed of the train as it passed that sensor, and from there calculate an estimated time of arrival at the depot to the nearest minute.

This estimate will be updated with every new pair of messages sent by each sensor along the track to give the elves a live picture of the progress of the train.

Although in real life the train would need to vary its speed depending on conditions, for this exercise assume the train accelerates from rest at a constant 1 m/s/s and decelerates at the same rate, and has a maximum speed of 150 km/h.
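To make the calculation concrete, here’s a minimal sketch in Python of what the receiving program might do with each pair of messages. The names and parsing details are my own assumptions, and the estimate naively assumes the speed measured at the sensor is held all the way to the depot:

from datetime import datetime, timedelta

TRAIN_LENGTH_M = 200
TRACK_LENGTH_KM = 860

def estimate_eta(front_msg, rear_msg):
    # Timestamps of the front and rear contacts passing the same sensor
    t_front = datetime.fromisoformat(front_msg["passing"]["datetime"].replace("Z", "+00:00"))
    t_rear = datetime.fromisoformat(rear_msg["passing"]["datetime"].replace("Z", "+00:00"))
    distance_km = float(front_msg["passing"]["distance"].split()[0])

    # 200 m of train passed the sensor in this many seconds
    elapsed_s = (t_rear - t_front).total_seconds()
    speed_m_per_s = TRAIN_LENGTH_M / elapsed_s
    remaining_m = (TRACK_LENGTH_KM - distance_km) * 1000

    eta = t_rear + timedelta(seconds=remaining_m / speed_m_per_s)
    # Round to the nearest minute for the elves' display
    return (eta + timedelta(seconds=30)).replace(second=0, microsecond=0)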

For this exercise, you will need to create two programs:

  • One that simulates the train’s journey, sending pairs of sensor-passing data messages at intervals approximating every kilometre of the train’s progress
  • Another to receive these messages and update the train’s estimated time of arrival and display it on the screen

The sensors will send one final message when the train has reached the depot to let the receiver know the journey has ended.

Best of luck. The elves are counting on you!

Does The Means Justify The End?

“Java isn’t truly OO”, “Our team isn’t truly Agile”, “We’re not really doing microservices”, “This isn’t strictly Continuous Delivery”. On a daily basis I hear people lamenting how some tool or technology or technique they’re using isn’t Pure, Undiluted Something.

I feel this loses sight of why we use these tools and do these things in the first place. That’s not to say these things don’t matter – if you want your code to be shippable whenever the business wants it, then that’s Continuous Delivery, for example – but I try always to start with the ‘why’ and not the ‘what’.

I think it’s another symptom of our overriding solutions focus; we tend to set out to build solutions rather than solve problems, and I see this extending beyond the software we create to the way we create it. Organisations set out to “move to the Cloud”, to “adopt Agile”, to “become service-oriented”, and so on. The effect of starting with the answer and working your way back to a question is that teams frequently end up inventing a question that fits that answer, and neglecting very real questions that their organisations face.

Take object oriented programming: when OO purists talk about programming languages, they often cite reusability as a goal. I encourage them to take a step back. Look at your organisation, look at its problems, its challenges and its opportunities – in that context, which of those problems will be solved by increased reuse of code? Would Kodak still be a world-leader in photography if they’d reused more of their code? Would Blockbuster still be here if they’d reused more of their code? Are WeWork laying off workers because they didn’t reuse more of their code?

That’s not to say that code reuse is never a real problem for any organisation. Companies setting out to create code that will be reused – libraries and APIs etc – would benefit from making their code easier to reuse, but is that your company? Is that a problem they face?

And I’m yet to be convinced that reusability is exclusive to OO, FP, CBD, SOA or any other such paradigm. C, COBOL and FORTRAN libraries have been very widely reused, for example. I suspect making them useful matters more than encapsulation or polymorphism.

For many organisations, software reuse is a red herring. And yet it still dominates as a concept in many IT departments.

Likewise with service-oriented architecture and microservices: “Oh, it’s much more scalable”, they tell me. I’m sure your system’s 80 users will be very relieved to hear that.

I continue to urge teams to solve problems instead of leading with the Solution du Jour. Maybe microservices are the answer. Maybe Agile is the answer. Maybe OOP is the answer. But the answer to what?

Start there.

Changing Legacy Code Safely

One of the topics we cover on the Codemanship TDD course is one that developers raise often: how can we write fast-running unit tests for code that’s not easily unit-testable? Most developers are working on legacy code – code that’s difficult and risky to change – most of the time. So it’s odd there’s only one book about it.

I highly recommend Michael Feathers’ book for any developer, working in any technology, applying any approach to development. On the TDD course, I summarise what we mean by “legacy code” – code that doesn’t have fast-running automated tests, making it risky to change – and briefly demonstrate Michael’s process for changing legacy code safely.

The example I use is a simple Python program for pricing movie rentals based on their IMDb ratings. Rentals of average-rated movies cost £3.95. High-rated movies cost an extra pound; low-rated movies cost a pound less.

My program has no automated tests, so I’ve been testing it manually using the command line.

Suppose the business asked us to change the pricing logic; how could we do this safely if we lack automated tests to guard against breaking the code?

Michael’s process goes like this:

  • Identify what code you will need to change
  • Identify where around that code you’d want unit tests
  • Break any dependencies that are stopping you from writing unit tests
  • Write the unit tests you’d want to satisfy you that the changes you’re about to make haven’t broken the code
  • Make the change
  • While you’re there, refactor to improve the code that’s now covered by unit tests to make life easier for the next person who changes it (which could be you)

My Python program has a class called Pricer which we’ll need to change to update the pricing logic.

I’ve been testing this logic one level above by testing the Rental class that uses Pricer.

The script I’ve been manually testing with lets me create Rental objects and write their data to the command line for different movies, using their IMDb IDs.
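The original code isn’t reproduced in this post, so here’s a rough approximation of the starting point – the rating thresholds and the details of the OMDb API call are my guesses, not the actual code:

import urllib.request
import json


class Pricer:
    def price(self, imdb_id):
        # Hard-wired dependency on the OMDb web API - the problem we'll tackle below
        response = urllib.request.urlopen(
            "http://www.omdbapi.com/?apikey=YOUR_KEY&i=" + imdb_id)
        video = json.loads(response.read())
        rating = float(video["imdbRating"])
        price = 3.95
        if rating >= 7.5:       # illustrative thresholds
            price += 1.0
        elif rating <= 4.0:
            price -= 1.0
        return video["Title"], price


class Rental:
    def __init__(self, customer, imdb_id):
        self.customer = customer
        self.title, self.price = Pricer().price(imdb_id)

    def __str__(self):
        return "Video Rental – customer: {}. Video => title: {}, price: £{}".format(
            self.customer, self.title, self.price)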

I use three example movies – one with a high rating, one low-rated and one medium-rated – to test the code. For example, the output for the high-rated movie looks like this.

C:\Users\User\Desktop\tdd 2.0\python_legacy>python program.py jgorman tt0096754
Video Rental – customer: jgorman. Video => title: The Abyss, price: £4.95

I’d like to reproduce these manual tests as unit tests, so I’ll be writing unittest tests for the Rental class for each kind of movie.

But before I can do that, there’s an external dependency we have to deal with. The Pricer class connects directly to the OMDB API that provides movie information. I want to stub that so I can provide test IMDB ratings without connecting.

Here’s where we have to get disciplined. I want to refactor the code to make it unit-testable, but it’s risky to do that because… there’s no unit tests! Opinions differ on approach, but personally – learned through bitter experience – I’ve found that it’s still important to re-test the code after every refactoring, manually if need be. It will seem like a drag, but we all tend to overlook how much time we waste downstream fixing avoidable bugs. It will seem slower to manually re-test, but it’s often actually faster in the final reckoning.

Okay, let’s do a refactoring. First, let’s get that external dependency in its own method.
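Sketched against the approximation above, that might leave Pricer looking something like this:

import urllib.request
import json


class Pricer:
    def price(self, imdb_id):
        video = self.fetch_video_info(imdb_id)
        rating = float(video["imdbRating"])
        price = 3.95
        if rating >= 7.5:
            price += 1.0
        elif rating <= 4.0:
            price -= 1.0
        return video["Title"], price

    def fetch_video_info(self, imdb_id):
        # All knowledge of the OMDb API is now isolated in one method
        response = urllib.request.urlopen(
            "http://www.omdbapi.com/?apikey=YOUR_KEY&i=" + imdb_id)
        return json.loads(response.read())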

I re-run my manual tests. Still passing. So far, so good.

Next, let’s move that new method into its own class.
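Continuing the sketch (imports as before), the extracted class might look like this:

class VideoInfo:
    def fetch_video_info(self, imdb_id):
        # The external dependency now lives in a class we can swap out
        response = urllib.request.urlopen(
            "http://www.omdbapi.com/?apikey=YOUR_KEY&i=" + imdb_id)
        return json.loads(response.read())


class Pricer:
    def price(self, imdb_id):
        video = VideoInfo().fetch_video_info(imdb_id)
        rating = float(video["imdbRating"])
        price = 3.95
        if rating >= 7.5:
            price += 1.0
        elif rating <= 4.0:
            price -= 1.0
        return video["Title"], price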

And re-test. All passing.

To make the dependency on VideoInfo swappable, the instance needs to be injected into the constructor of Pricer from Rental.
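Which, continuing the sketch, might look like this:

class Pricer:
    def __init__(self, video_info):
        # The dependency is passed in rather than created internally
        self.video_info = video_info

    def price(self, imdb_id):
        video = self.video_info.fetch_video_info(imdb_id)
        rating = float(video["imdbRating"])
        price = 3.95
        if rating >= 7.5:
            price += 1.0
        elif rating <= 4.0:
            price -= 1.0
        return video["Title"], price


class Rental:
    def __init__(self, customer, imdb_id):
        self.customer = customer
        # Rental still wires up the real VideoInfo - for now
        self.title, self.price = Pricer(VideoInfo()).price(imdb_id)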

And re-test. All passing.

Next, we need to inject the Pricer into Rental, so we can stub VideoInfo in our planned unit tests.
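Again as a sketch, Rental now receives its Pricer, and the command-line program does the wiring of the real dependencies:

class Rental:
    def __init__(self, customer, imdb_id, pricer):
        self.customer = customer
        self.title, self.price = pricer.price(imdb_id)

    def __str__(self):
        return "Video Rental – customer: {}. Video => title: {}, price: £{}".format(
            self.customer, self.title, self.price)


# program.py wires the real collaborators together
print(Rental("jgorman", "tt0096754", Pricer(VideoInfo())))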

And re-test. All passing.

Now we can write unit tests to replicate our command line tests.
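For example – the stub and test names here are mine, not the post’s:

import unittest


class VideoInfoStub:
    # Stands in for the real OMDb-backed VideoInfo
    def __init__(self, title, imdb_rating):
        self.video = {"Title": title, "imdbRating": imdb_rating}

    def fetch_video_info(self, imdb_id):
        return self.video


class RentalTests(unittest.TestCase):
    def rental_for(self, title, rating):
        return Rental("jgorman", "tt0000000", Pricer(VideoInfoStub(title, rating)))

    def test_high_rated_movie_costs_4_95(self):
        self.assertAlmostEqual(4.95, self.rental_for("The Abyss", "7.6").price, places=2)

    def test_average_rated_movie_costs_3_95(self):
        self.assertAlmostEqual(3.95, self.rental_for("Solaris", "6.2").price, places=2)

    def test_low_rated_movie_costs_2_95(self):
        self.assertAlmostEqual(2.95, self.rental_for("Battlefield Earth", "2.5").price, places=2)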

These unit tests reproduce all the checks I was doing visually at the command line, but they run in a fraction of a second. The going gets much easier from here.

Now I can make the change to the pricing logic the business requested.


We can tackle this in a test-driven way now. Let’s update the relevant unit test so that it now fails.

Now let’s make it pass.

(And, yes – obviously in a real product, the change would likely be more complex than this.)

Okay, so we’ve made the change, and we can be confident we haven’t broken the software. We’ve also added some test coverage and dealt with a problematic dependency in our architecture. If we wanted to get movie ratings from somewhere else (e.g., Rotten Tomatoes), or even aggregate sources, it would be quite straightforward now that we’ve cleanly separated that concern from our business logic.

One last thing while we’re here: there are a couple of things in this code that have been bugging me. Firstly, we’ve been mixing our terminology: the customer says “movie”, but our code says “video”. Let’s make our code speak the customer’s language.

Secondly, I’m not happy with clients accessing objects’ fields directly. Let’s encapsulate.
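After both refactorings, Rental might end up looking something like this (still a sketch, not the post’s actual code):

class Rental:
    def __init__(self, customer, imdb_id, pricer):
        self._customer = customer
        self._title, self._price = pricer.price(imdb_id)

    @property
    def price(self):
        # Clients read the price through a property instead of a raw field
        return self._price

    def __str__(self):
        # "Movie", not "video" - the code now speaks the customer's language
        return "Movie Rental – customer: {}. Movie => title: {}, price: £{}".format(
            self._customer, self._title, self._price)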

With our added unit tests, these extra refactorings were much easier to do, and hopefully that means that changing this code in the future will be much easier, too.

Over time, one change at a time, the unit test coverage will build up and the code will get easier to change. Applying this process over weeks, months and years, I’ve seen some horrifically rigid and brittle software products – so expensive and risky to change that the business had stopped asking – be rehabilitated and become going concerns again.

By focusing our efforts on changes our customer wants, we’re less likely to run into a situation where writing unit tests and refactoring gets vetoed by our managers. The results of highly visible “refactoring sprints”, or even long refactoring phases – I’ve known clients freeze requirements for up to a year to “refactor” legacy code – are typically disappointing, and run the risk of making refactoring and adding unit tests forbidden by disgruntled bosses.

One final piece of advice: never, ever discuss this process with non-technical stakeholders. If you’re asked to break down an estimate to change legacy code, resist. My experience has been that it often doesn’t take any longer to make the change safely, and the longer-term benefits are obvious. Don’t give your manager or your customer the opportunity to shoot themselves in the foot by offering up unit tests and refactoring as a line item. Chances are, they’ll say “no, thanks”. And that’s in nobody’s interests.

The 4 Gears of Test-Driven Development

When I explain Test-Driven Development to people who are new to the concept, I try to be clear that TDD is not just about using unit tests to drive design at the internal code level.

The unit tests and the familiar red-green-refactor micro feedback cycle that we most commonly associate with TDD – thanks to 1,001 TDD katas that focus at that level – are actually just the innermost feedback loop of TDD. There are multiple outer feedback loops that drive the choice of unit tests. Otherwise, how would we know what unit tests we needed to write?

Outside the rapid unit test feedback loop, there’s a slower customer test feedback loop that drives our understanding of what our units need to do in a particular software usage scenario.

Outside the customer test feedback loop, there’s a slower-still feature feedback loop, which may require us to pass multiple customer tests to complete.

And, most important of all, there’s an even slower goal feedback loop that drives our understanding of what features might be required to solve a business problem.

On the Codemanship TDD course, pairs experience these feedback loops first hand. They’re asked to think of a real-world problem they believe might be solved with a simple piece of software. For example, “It’s hard to find good vegan takeaway in my local area.” We’re now in the first feedback loop of TDD – goals.

Then they imagine a headline feature – a proverbial button the user clicks that solves this problem: what would that feature do? Perhaps it displays a list of takeaway restaurants with vegan dishes on their menu that will deliver to my address, ordered by customer ratings. We’re now in the next feedback loop of TDD – features.

Next, we need to think about what other features the software might require to make the headline feature possible. For example, we need to gather details of takeaway restaurants in the area, including their vegan menus and their locations, and whether or not they’ll deliver to the customer’s address. Our headline feature might require a number of such supporting features to make it work.

We work with our customer to design a minimum feature set that we believe will solve their problem. It’s important to keep it as simple as we can, because we want a working prototype that we can test with real end users in the real world as soon as possible.

Next, for each feature – starting with the most important one, which is typically the headline feature – we drive out a precise understanding of exactly what that feature will do using examples harvested from the real world. We might go online, or grab a phone book, and start checking out takeaway restaurants, collecting their menus and asking what postcode areas they deliver in. Then we would pick addresses in our local area, and figure out – for each address – which restaurants would be available according to our criteria. We could search on sites like Google and Trip Advisor for reviews of the restaurants – or, if we can’t find reviews, invent some ratings – so we can describe how the result lists should be ordered.

We capture these examples in a format that’s human readable and machine readable, so we can collaborate directly with the customer on them and also pull the same data into automated executable tests.

We’re now in the customer test feedback loop. Working one customer test at a time, we automate execution of that test so we can continuously check our progress in passing it.
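As a rough illustration – none of these names come from the course material, and a toy implementation is included only to make the example self-contained – an automated customer test for the headline feature might look something like this:

import unittest


class Restaurant:
    def __init__(self, name, delivery_postcodes, rating):
        self.name = name
        self.delivery_postcodes = delivery_postcodes
        self.rating = rating

    def delivers_to(self, postcode):
        return postcode in self.delivery_postcodes


class TakeawayFinder:
    def __init__(self, restaurants):
        self.restaurants = restaurants

    def search(self, postcode):
        # Only restaurants that deliver to this postcode, best-rated first
        matches = [r for r in self.restaurants if r.delivers_to(postcode)]
        return sorted(matches, key=lambda r: r.rating, reverse=True)


class HeadlineFeatureTest(unittest.TestCase):
    def test_results_are_deliverable_and_ordered_by_rating(self):
        # Example data worked out with the customer
        restaurants = [
            Restaurant("Vegan Villa", ["SW19"], 4.2),
            Restaurant("Green Garden", ["SW19", "SW20"], 4.8),
            Restaurant("Tofu Town", ["N1"], 4.9),  # doesn't deliver to SW19
        ]
        results = TakeawayFinder(restaurants).search("SW19")
        self.assertEqual(["Green Garden", "Vegan Villa"], [r.name for r in results])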

For each customer test, we then test-drive an implementation that will pass the test, using unit tests to drive out the details of how the software will complete each unit of work required. If the happy path for our headline feature requires that we

  • calculate a delivery map location using the customer’s address
  • identify for each restaurant in our list if they will deliver to that location
  • filter the list to exclude the restaurants that don’t
  • order the filtered list by average customer rating

…then that’s a bunch of unit tests we might need to write. We’re now in the unit test feedback loop.
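For instance, a unit test for the filtering step might look something like this (purely illustrative names again):

import unittest


def exclude_non_delivering(restaurants, location):
    # One unit of work: drop restaurants that won't deliver to this location
    return [r for r in restaurants if r.delivers_to(location)]


class ExcludeNonDeliveringTests(unittest.TestCase):
    class RestaurantStub:
        def __init__(self, delivers):
            self.delivers = delivers

        def delivers_to(self, location):
            return self.delivers

    def test_restaurants_that_do_not_deliver_are_excluded(self):
        delivering = self.RestaurantStub(True)
        not_delivering = self.RestaurantStub(False)
        results = exclude_non_delivering([delivering, not_delivering], "SW19 1AA")
        self.assertEqual([delivering], results)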

Once we’ve completed our units and seen the customer test pass, we can move on to the next customer test, passing them one at a time until the feature is complete.

Many dev teams make the mistake of thinking that we’re done at this point. This is usually because they have no visibility of the real end goal. We’re rarely invited to participate in that conversation, to be fair. Which is a terrible, terrible mistake.

Once all the features – headline and supporting – are complete, we’re ready to test our minimum solution with real end users. We release our simple software to a representative group of tame vegan takeaway diners, who will attempt to use it to find good food. Heck, we can try using it ourselves, too. I’m all in favour of developers eating their own (vegan) dog food, because there’s no substitute for experiencing it for ourselves.

Our end users may report that some of the restaurants in their search results were actually closed, and that they had to phone many takeaway restaurants to find one open. They may report that when they ordered food, it took over an hour to be delivered to their address because the restaurant had been a little – how shall we say? – optimistic about their reach. They may report that they were specifically interested in a particular kind of cuisine – e.g., Chinese or Indian – and that they had to scroll through pages and pages of results for takeaway that was of no interest to find what they wanted.

We gather this real-world feedback and feed that back into another iteration, where we add and change features so we can test again to see if we’re closer to achieving our goal.

I like to picture these feedback loops as gear wheels. The biggest gear – goals – turns the slowest, and it drives the smaller features gear, which turns faster, driving the smaller and faster customer tests wheel, which drives the smallest and fastest unit tests wheel.

[Figure: the four gears of TDD – goals, features, customer tests, unit tests]

It’s important to remember that the outermost wheel – goals – drives all the other wheels. They should not be turning by themselves. I see many teams where it’s actually the features wheel driving the goals wheel, and teams force their customers to change their goals to fit the features they’re delivering. Bad developers! In your beds!

It’s also very, very important to remember that the goals wheel never stops turning, because there’s actually an even bigger wheel making it turn – the real world – and the real world never stops turning. Things change, and there’ll always be new problems to solve – especially because, when we release software into the world, the world changes.

This is why it’s so very important to keep all our wheels well-oiled so they can keep on turning for as long as we need them to. If there’s too much friction in our delivery processes, the gears will grind to a halt – but the real world will keep on turning whether we like it or not.