Readable Parameterized Tests

Parameterized tests (sometimes called “data-driven tests”) can be a useful technique for removing duplication from test code, as well as potentially buying teams much greater test assurance with surprisingly little extra code.

But they can come at the price of readability. So if we’re going to use them, we need to invest some care in making sure it’s easy to understand what the parameter data means, and to ensure that the messages we get when tests fail are meaningful.

Some testing frameworks make it harder than others, but I’m going to illustrate using some mocha tests in JavaScript.

Consider this test code for a Mars Rover:

it("turns right from N to E", () => {
let rover = {facing: "N"};
rover = go(rover, "R");
assert.equal(rover.facing, "E");
})
it("turns right from E to S", () => {
let rover = {facing: "E"};
rover = go(rover, "R");
assert.equal(rover.facing, "S");
})
it("turns right from S to W", () => {
let rover = {facing: "S"};
rover = go(rover, "R");
assert.equal(rover.facing, "W");
})
it("turns right from W to N", () => {
let rover = {facing: "W"};
rover = go(rover, "R");
assert.equal(rover.facing, "N");
})
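In case you want to follow along, here's a minimal go() that would pass these tests – just a sketch; the lookup table is my invention, and the real Rover implementation isn't important here:

// Hypothetical implementation of go(), just to make the examples runnable.
const RIGHT_TURNS = {N: "E", E: "S", S: "W", W: "N"};

function go(rover, command) {
    if (command === "R") {
        return {...rover, facing: RIGHT_TURNS[rover.facing]};
    }
    return rover; // other commands omitted in this sketch
}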

These four tests are different examples of the same behaviour, and there’s a lot of duplication (I should know – I copied and pasted them myself!)

We can consolidate them into a single parameterized test:

[{input: "N", expected: "E"}, {input: "E", expected: "S"}, {input: "S", expected: "W"},
{input: "W", expected: "N"}].forEach(
function (testCase) {
it("turns right", () => {
let rover = {facing: testCase.input};
rover = go(rover, "R");
assert.equal(rover.facing, testCase.expected);
})
})

While we’ve removed a fair amount of duplicate test code, arguably this single parameterized test is harder to follow – both at read-time, and at run-time.

Let’s start with the parameter names. Can we make it more obvious what roles these data items play in the test, instead of just using generic names like “input” and “expected”?

[{startsFacing: "N", endsFacing: "E"}, {startsFacing: "E", endsFacing: "S"}, {startsFacing: "S", endsFacing: "W"},
 {startsFacing: "W", endsFacing: "N"}].forEach(
    function (testCase) {
        it("turns right", () => {
            let rover = {facing: testCase.startsFacing};
            rover = go(rover, "R");
            assert.equal(rover.facing, testCase.endsFacing);
        })
    })

And how about we format the list of test cases so they’re easier to distinguish?

[
    {startsFacing: "N", endsFacing: "E"},
    {startsFacing: "E", endsFacing: "S"},
    {startsFacing: "S", endsFacing: "W"},
    {startsFacing: "W", endsFacing: "N"}
].forEach(
    function (testCase) {
        it("turns right", () => {
            let rover = {facing: testCase.startsFacing};
            rover = go(rover, "R");
            assert.equal(rover.facing, testCase.endsFacing);
        })
    })

And how about we declutter the body of the test a little by destructuring the testCase object?

[
    {startsFacing: "N", endsFacing: "E"},
    {startsFacing: "E", endsFacing: "S"},
    {startsFacing: "S", endsFacing: "W"},
    {startsFacing: "W", endsFacing: "N"}
].forEach(
    function ({startsFacing, endsFacing}) {
        it("turns right", () => {
            let rover = {facing: startsFacing};
            rover = go(rover, "R");
            assert.equal(rover.facing, endsFacing);
        })
    })

Okay, hopefully this is much easier to follow. But what happens when we run these tests?
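When I run them with mocha, the report reads something like this (output approximate):

  ✓ turns right
  ✓ turns right
  ✓ turns right
  ✓ turns right

  4 passing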

It’s not at all clear which test case is which. So let’s embed some identifying data inside the test name.

[
    {startsFacing: "N", endsFacing: "E"},
    {startsFacing: "E", endsFacing: "S"},
    {startsFacing: "S", endsFacing: "W"},
    {startsFacing: "W", endsFacing: "N"}
].forEach(
    function ({startsFacing, endsFacing}) {
        it(`turns right from ${startsFacing} to ${endsFacing}`, () => {
            let rover = {facing: startsFacing};
            rover = go(rover, "R");
            assert.equal(rover.facing, endsFacing);
        })
    })

Now when we run the tests, we can easily identify which test case is which.
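The report now reads something like:

  ✓ turns right from N to E
  ✓ turns right from E to S
  ✓ turns right from S to W
  ✓ turns right from W to N

  4 passing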

With a bit of extra care, it’s possible with most unit testing tools – not all, sadly – to have our cake and eat it with readable parameterized tests.
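Some frameworks build this support in. Jest, for example, offers test.each, which takes a table of cases and a templated test name – roughly equivalent to the mocha forEach approach above:

// Jest (not mocha): test.each generates one named test per row.
test.each([
    ["N", "E"],
    ["E", "S"],
    ["S", "W"],
    ["W", "N"]
])("turns right from %s to %s", (startsFacing, endsFacing) => {
    let rover = {facing: startsFacing};
    rover = go(rover, "R");
    expect(rover.facing).toBe(endsFacing);
})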

Pull Requests & Defensive Programming – It’s All About Trust

A very common thing I see on development teams is reliance on code reviews for every check-in (in this age of Git-Everything, often referred to as “Pull Requests”). This can create bottlenecks in the delivery process, as our peers are pulled away from their own work and we have to wait for their feedback. And, often, the end result is that these reviews are superficial at best, missing a tonne of problems while still holding up delivery.

[Image: Pull Request code reviews on a busy team]

But why do we do these reviews in the first place?

I think of it in programming terms. Imagine a web service. It has a number of external clients that send it requests via the Web.

[Image: Some clients can be trusted, others not]

These client apps were not written by us. We have no control over their code, and therefore can’t guarantee that the requests they’re sending will be valid. There’s a need, therefore, to validate these requests before acting on them.

This is what we call defensive programming, and in these situations where we cannot trust the client to call us right, it’s advisable.
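For example – and this is just an illustrative sketch, with made-up route, field and function names – an Express handler doing that validation might look like:

// Defensive programming: validate the untrusted request before acting on it.
app.post("/rover/commands", (request, response) => {
    const {commands} = request.body;
    if (typeof commands !== "string" || !/^[LRF]+$/.test(commands)) {
        return response.status(400).send("commands must be a string of L, R and F");
    }
    // Only now is it safe to act on the request.
    executeCommands(commands);
    response.sendStatus(200);
});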

Inside our firewall, our web service calls a microservice. Both services are controlled by us – that is, we’re writing both client and server code in this interaction.

Does the microservice need to validate those requests? Not if we can be trusted to write client code that obeys the contract.

In that case, a more appropriate style of programming might be Design By Contract. Clients are trusted to satisfy the preconditions of service calls before they call them: in short, if it ain’t valid, they don’t call, and the supplier doesn’t need to waste time – and code – checking the requests. That’s the client’s job.
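In code, that division of responsibility might look something like this sketch (names invented for illustration):

// Design By Contract: the supplier trusts callers to satisfy the
// precondition; at most it asserts it during development.
function executeCommands(rover, commands) {
    // Precondition: commands is a non-empty string of L, R and F.
    console.assert(/^[LRF]+$/.test(commands), "precondition violated");
    return [...commands].reduce(go, rover);
}

// The trusted client checks validity before calling - that's its job.
if (/^[LRF]+$/.test(commands)) {
    rover = executeCommands(rover, commands);
}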

Now let’s project these ideas onto code reviews. If a precondition of merging to the main branch is that your code satisfies certain code quality preconditions – test coverage, naming, simplicity, etc. – then we have two distinct potential situations:

  • The developer checking in can be trusted not to break those preconditions (e.g., they never check in code that doesn’t have passing tests)
  • The developer can’t be trusted not to break them

In an open source code base, we have a situation where potentially anyone can attempt to contribute. The originators of that code base – e.g., Linus – have no control over who tries to push changes to the main branch. So they defend the code base – somewhat over-enthusiastically, perhaps – from bad inputs, just as our web service defends our system from bad requests.

In a closed-source situation, where the contributors have been chosen and we can exercise some control over who can attempt to check in changes, a different situation may arise. Theoretically, we hired these developers because we believe we can trust them.

I personally do not check in code that doesn’t have good, fast-running, passing automated tests. I personally do not check in spaghetti code (unless it’s for a workshop on refactoring spaghetti code). If we agree what the standards are for our code, I will endeavour not to break them. I may also use tools to help me keep my code clean pre-check-in. I’m the web service talking to the microservice in that diagram. I’m a trusted client.

But not all developers can be trusted not to break the team’s coding standards. And that’s the problem we need to be addressing here. Ask not so much “How do we do Pull Requests?”, but rather “Why do we need to do Pull Requests?” There are some underlying issues about skills and about trust.

Pull Requests are a symptom, not a solution.

Codemanship Code Craft Videos

Over the last 6 months, I’ve been recording hands-on tutorials about code craft – TDD, design principles, refactoring, CI/CD and more – for the Codemanship YouTube channel.

I’ve recorded the same tutorials in JavaScript, Java, C# and (still being finished) Python.

As well as serving as a back-up for the Codemanship Code Craft training course, these series of videos form possibly the most comprehensive free learning resource on the practices of code craft available anywhere.

Each series has over 9 hours of video, plus links to example code and other useful resources.

[Image: Codemanship Code Craft videos currently available]

I’ve heard from individual developers and teams who’ve been using these videos as the basis for their practice road map. What seems to work best is to watch a video, and then straight away try out the ideas on a practical example (e.g., a TDD kata or a small project) to see how they can work on real code.

In the next few weeks, I’ll be announcing Codemanship Code Craft Study Groups, which will bring groups of like-minded learners together online once a week to watch the videos and pair program on carefully designed exercises, with coaching from me.

This will be an alternative way of receiving our popular training, but with more time dedicated to hands-on practice and coaching, and more time between lessons for the ideas to sink in. It should also be significantly less disruptive than taking a whole team out for 3 days for a regular training course, and significantly less exhausting than 3 full days of Zoom meetings! Plus the price per person will be the same as the regular Code Craft course.