Automated Tests Aren’t Just For The Long-Term

Something I hear worryingly often still is teams – especially managers – saying “Oh, we don’t need to automate our tests because there’s only going to be one release.”

The received wisdom is that investing in fast-running automated tests is only worth it if the software’s going to have a long lifespan, with many subsequent releases. (This is a sentiment often expressed about code craft in general.)

The assumption is that fast-running unit tests have less – or zero – value in the short-to-medium term. But this is easily disproved.

Let’s ask ourselves what we need fast-running tests for in the first place: to guard against regressions when we change the code. The inexperienced team or manager might argue that “we won’t be changing the code, because there’s only going to be one release”.

Analysis by GitLab’s data sciences team clearly shows that code churn – when classified as code that changes within 2-3 weeks of being checked in – for the average team runs at about 25%. An average team of, say, four developers might check in 10,000 LOC on a 12-week release schedule. 2,500 lines of that code will change within 2-3 weeks. That’s a lot of changes.

And that’s normal. Expect it.

This is before we take into account the many changes a programmer will make to code before they check it in. If I only tested my code when it was time to check it in, I think I’d really struggle.

It’s a question of batch size. If I make one change and then re-test, and I’ve broken something, it’s much, much easier to pinpoint what’s gone wrong. And it’s way, way easier to get back to code that works. If I make 100 changes and re-test, I’m probably going to end up knee-deep in the debugger and up to me neck in print statements, and reverting to the last working copy means losing a tonne of work.

So I test pretty much continuously, and find even on relatively small projects that my hide gets saved multiple times by having these tests.

Change is much easier with fast-running tests, and change is a normal part of delivery.

And then there’s the whole question of whether it really will be the only release of the software. Experience has taught me that if software gets used, it gets changed. The only one-shot deals I’ve experienced in harumpty-twelve years of writing software have been the unsuccessful ones.

Imagine we’re asked to dig out an underground shelter for our customer. They tell us they need a chamber 8 ft x 8 ft x 6 ft – big enough for a bed – and we dutifully start digging. Usually, we would put up wooden supports as we dig, to stop the chamber from caving in. “No need”, says the customer. “It’s only one room, and we’ll only use it once.”

So, we don’t put in any supports. And that makes completing the chamber harder, because it keeps caving in due to the vibrations of our ongoing excavations. For every cubic metre of dirt we excavate, we end up digging out another half a cubic metre from the cave-ins. But we get there in the end, and the customer pays us our money and moves their bed in.

Next week, we get a phone call. “Where do we keep our food supplies?” Turns out, they’ll need another room. Would they like us to put supports up in the main chamber before we start digging again? “No time! We need our food store ASAP.” Okey dokey. We start digging again, and the existing chamber starts caving in again, but we dig out the loose earth and carry on as best we can. We manage to get the food store done, but with a lot more work this time, because both spaces keep caving in, and we keep having to dig them out again and again, recreating spaces we’d already excavated several times.

The customer moves in their food supplies, but their elderly mother now refuses to go into the shelter because she’s not sure it’s safe.

A week later: “Oh hi. Er. Where do we go to the bathroom?” Work begins on a third chamber. Would they like us to put supports in to the other two chambers first? “No. Need a bathroom ASAP!!!” they exclaim with a rather pained expression. So we dig and dig and dig, now so tired that we barely notice that most of the space we’re excavating has been excavated before, and most of the earth we’re removing has been coming from the ceilings of the existing chambers as well as from the new bathroom.

This is what it is to work without fast-running tests. Even on small, one-shot deals of just a few days, regressions can become a major expense, quickly outweighing the cost of writing tests in the first place.

How to Beat Evil FizzBuzz

On the last day of the 3-day Codemanship TDD training workshop, participants are asked to work as a team to solve what would – for an individual developer – be a very simple exercise.

The FizzBuzz TDD kata is well known, and a staple in many coding interviews these days. Write a program that outputs the numbers 1…100 as a single comma-delimited string. Any numbers that are divisible by 3, replace with ‘Fizz’. Any numbers that are divisible by 5, replace with ‘Buzz’. And any numbers that are divisible by 3 and 5, replace with ‘FizzBuzz’. Simples.
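For reference, here is a minimal Python sketch of the plain, non-evil kata as a single developer might write it. The function name `fizzbuzz_string` and the `limit` parameter are assumptions for illustration; the rules are the ones stated above.

```python
def fizzbuzz_string(limit=100):
    """Return the numbers 1..limit as one comma-delimited FizzBuzz string."""

    def convert(number):
        # Check the combined case first, otherwise 15 would be caught by the 3 rule.
        if number % 15 == 0:
            return "FizzBuzz"
        if number % 3 == 0:
            return "Fizz"
        if number % 5 == 0:
            return "Buzz"
        return str(number)

    return ",".join(convert(number) for number in range(1, limit + 1))


if __name__ == "__main__":
    print(fizzbuzz_string())
```

The one design decision worth noticing is the order of the checks: the "divisible by 3 and 5" rule has to win over the other two.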

An individual can usually complete this in less than half an hour. But what if we make it evil?

We split the problem up into five parts, and then assign each part to a pair or individual in the group, who can only work on the code for their part.

  • Generate a list of integers from 1 to 100
  • Replace integers divisible by 3 with ‘Fizz’
  • Replace integers divisible by 5 with ‘Buzz’
  • Replace integers divisible by 3 and 5 with ‘FizzBuzz’
  • Output the resulting list as a comma-delimited string

Working as a single team to produce a single program that passes my customer test – seeing the final string with all the numbers, Fizzes, Buzzes and FizzBuzzes in the right places produced by their program run on my computer – the group has to coordinate closely to produce a working solution. They have one hour, and no more check ins are allowed after their time’s up. They must demonstrate whatever they’ve got in the master branch of their GitHub repository at the end of 60 minutes.

This is – on the surface of it – an exercise in Continuous Integration. They need to create a shared repository, and each work on their own copy, pushing directly to the master branch. (This is often referred to as trunk-based development.) They must set up a CI server that runs a build – including automated tests – whenever changes are pushed.

Very importantly, once the CI server is up and running, and they’ve got their first green build, the build must never go red again. (Typically it takes a few tries to get a build up and running, so they often start red.)

Beyond those rules:

  • Produce a single program that passes the customer’s test on the customer’s machine
  • Only write code for the part they’ve been assigned
  • Push directly to master on a single GitHub repository – no branching
  • CI must run a full build – including tests – on every push
  • Must not break the build once it’s gone green for the first time
  • Last push must happen before the end of the hour

They can do whatever they need to. It’s their choice of programming language, application type (console, web app, desktop app etc) and so on. They choose which CI solution to use.

90% of groups who attempt Evil FizzBuzz fail to complete it within the hour. The three most common reasons they fail are:

  1. Too long shaving yaks – many groups don’t get their CI up and running until about 30-40 minutes in. In some cases, they never get it up and running.
  2. Lack of a bigger picture – many groups fail to establish a shared vision for how their program will work, and – importantly – how the pieces will fit together
  3. Integrating too late – with cloud-based CI, the whole process of checking your code in can take 2-3 minutes minimum. Times that by 5, and groups often discover that everyone deciding to push their changes with just five minutes to go means their ship has sailed without them.

On the first point, it’s important to have a game plan and to keep things simple. I can illustrate using a Node and JavaScript example.

First, one of the pairs needs to create a skeleton Node project, with a dummy test for the build server to run. We need to get our delivery pipeline up and running quickly, before anyone even thinks about writing any solution code.

[Image: skeleton Node project]

This is just an empty Node project, with a single dummy Mocha unit test. Make sure the test passes, then create a GitHub repository and push this skeleton project to it.

[Image: initial commit]

Now, let’s set up a CI server. I’m going to use circleci.com. Logging in with my GitHub account, I can easily see and add a build project for my new evil_fizzbuzz repository.

[Image: adding the CircleCI project]

It helps enormously to go with the popular conventions for your project. I’m using Node, which is widely supported, Mocha for tests which are named and located where – by default – the build tool would expect to find them, and it’s all very Yarn-friendly. Well, maybe. We’ll see. I add a .circleci/config.yml file to my project and paste in the default settings recommended for my project by CircleCI.

[Image: CircleCI config]

Then I push this new file to master, and instruct CircleCI to start a build. This first build fails. They usually do. Looking at the output, the part of the workflow where it fell over has the error message:

The engine "node" is incompatible with this module. Expected version "6.* || 8.* || >= 10.*"

I’m not proud. Don’t sit there trying to figure things like this out. Just Google the error message and see if anyone has a fix for it. Turns out it’s common, and there’s a simple fix you can do in the config.yml file. So I fix it, push that change, and wait for a second build.

[Image: green build]

The build succeeds, but I need to make sure the test was actually run before we can continue.

[Image: tests ran]

Looks like we’re in business. Time to start working on our solution.

Next, you’ll need to invite all your team mates to contribute to your GitHub project. This is where team skills help: someone needs to get all the necessary user IDs, make sure everyone is aware that invites are being sent out, and ensure everyone accepts their invite ASAP. Coordination!

While this is going on, someone should be thinking about how the finished program will be demonstrated on the customer’s laptop. Do they have a compatible version of Node.js installed already? And how will they resolve dependencies – in this case, Mocha?

Effective software design begins and ends with the user experience. The pair responsible for the final output should take care of this, I think.

Time to complete our end-to-end “Hello, world!” so our delivery pipeline joins all the dots.

The output pair add a JavaScript file that will act as the entry point for the program, and have it write “Hello, world!” to the console.

[Image: “Hello, world!” entry point]

After checking program.js works on the local command line, push it to master.

We establish that our customer – me, in this case – happens to have Git and Node.js installed, so possibly the simplest way to demonstrate the program running on my computer might be to clone the files from master into a local folder, run npm install to resolve the Mocha dependency, and then we can just run node program.js in our customer demo. (We can tidy that up later if need be, but it will pass the test.)

rmdir teamjason /s /q
mkdir teamjason
cd teamjason
git clone https://github.com/jasongorman/evil_fizzbuzz.git
cd evil_fizzbuzz
npm install

We test that it works on the customer’s laptop, and now we’re finally ready to start implementing our FizzBuzz solution.

Phew. Yaks shaved.

But where to start?

This is the second place a lot of teams go wrong. They split off into their own pairs, clone the GitHub repository, and start working on their part of the solution straight away with no overall understanding of how it will all fit together to solve the problem.

This is where mob programming can help. Before splitting off, get everyone around one computer (there’s always a projector or huge TV in the room they can use). The pair responsible for writing the final output write the code (which satisfies the rules), while the rest of the group give input on the top-level design. In simpler terms, the team works outside-in, to identify what parts will be needed and see how their part fits in.

In my illustration, I’m thinking maybe a bit of functional composition might be the way to go.

This is the only code the pair who are responsible for outputting the result are allowed to write, according to the rules of Evil FizzBuzz. But the functions used here don’t exist, so we can’t push this to master without breaking the build.

Here’s where we get creative. Each of the other four pairs takes their turn at the keyboard to declare their function – just an empty one for now.
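As a rough Python sketch of that starting point (the walkthrough itself uses Node and JavaScript, and every function name and the order of composition here are assumptions for illustration), the output pair's top-level code plus the four empty placeholder functions might look like this:

```python
def numbers():
    # Placeholder for the 'generate 1 to 100' pair: nothing generated yet.
    return []


def fizzbuzz(items):
    # Placeholder for the FizzBuzz pair: passes the list through unchanged.
    return items


def fizz(items):
    # Placeholder for the Fizz pair.
    return items


def buzz(items):
    # Placeholder for the Buzz pair.
    return items


def output(items):
    # Owned by the output pair: join whatever arrives into one string.
    return ",".join(str(item) for item in items)


def program():
    # The top-level design: each part feeds the next.
    # FizzBuzz is applied first in this sketch so 15, 30, ... are not
    # swallowed by the Fizz or Buzz steps.
    return output(buzz(fizz(fizzbuzz(numbers()))))


if __name__ == "__main__":
    print(program())
```

Because every placeholder has an agreed name and an agreed shape (a list in, a list out), each pair can later fill in their own function without touching anyone else's code.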

We can run this and see that it is well-formed, and produces an empty output, as we’d expect at this point. Let’s push it to master.

It’s vital for everyone to keep one eye on the build status, as it’s a signal – a pulse, if you like – every developer on a team needs to be aware of. This build succeeds.

[Image: builds]

So, we have an end-to-end delivery pipeline, and we have a high-level design, so everyone can see how their part fits into the end solution.

This can be where pairs split off to implement their part. Now is the time to make clones and here’s where the CI skills come into play.

Let’s say one pair is working on the Fizz part. They take a clone of master, and – because it is a TDD course, after all – write and pass their first Mocha test.
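In Python terms, using unittest rather than Mocha, that first test and just enough code to pass it might look like the sketch below. The `fizz` signature (a list in, a list out) and the test name are assumptions for illustration, not the code from the workshop.

```python
import unittest


def fizz(items):
    """Replace integers divisible by 3 with 'Fizz'; leave everything else alone."""
    return [
        "Fizz" if isinstance(item, int) and item % 3 == 0 else item
        for item in items
    ]


class FizzTest(unittest.TestCase):
    def test_three_is_replaced_with_fizz(self):
        self.assertEqual([1, 2, "Fizz"], fizz([1, 2, 3]))
```

The `isinstance` check means strings already placed in the list by another pair's function are passed through untouched.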

On a green light, it’s time maybe for a bit of refactoring. The pair decide to pull the fizz function into its own file, to keep what they’re doing more separate from everyone else.

Having refactored the structure of the solution a little, they feel this might be a good time to share those changes with the rest of the team. This helps avoid the third mistake teams make – integrating too late, with too many potentially conflicting changes. (Many Evil FizzBuzz attempts end with about 15 minutes of merge hell.) Typically this ends with them breaking the build and the team disqualified.

But before pushing to master, they run all of the tests, just to be sure.

[Image: Fizz test run]

With all tests passing, it should be safe to push. Then they wait for a green build before moving on to the next test case.

[Image: build in progress]

While builds are in progress, other members of the team must be mindful that it’s not safe to push their changes until the whole process has completed successfully. They must also ensure they don’t pull changes that break the build, so everyone should be keeping one eye on the build status.

Phew. It’s green.

When you see someone else’s build succeed, that would be a good time to consider pulling any changes that have been made, and running all of the tests locally. Keeping in step with master, when working in such close proximity code-wise, is very important.

Each pair continues in this vein: pass a test, maybe do some refactoring, check in those changes, wait for a green build, pull any changes other pairs have made when you see their builds go green, and keep running those tests!

It’s also a very good idea to keep revisiting the customer test to see what visible progress is being made, and to spot any integration problems as early as possible. Does the high-level design actually work? Is each function playing its part?

Let’s pay another visit to the team after some real progress has been made. When we run the customer test program, what output do we get now?

[Image: command-line output, work in progress]

Okay, it looks like we’re getting somewhere now. The list of 100 numbers is being generated, and every third number is Fizz. Work is in progress on Buzz and FizzBuzz. If we were 45 minutes in at this point, we’d be in with a shot at beating Evil FizzBuzz.

Very quickly, the other two pieces of our jigsaw slot into place. First, the Buzzes…

[Image: command-line output with Buzzes]

And finally the FizzBuzzes.

[Image: complete command-line output]

At this point, we’re pretty much ready for our real customer test. We shaved the yaks, we established an overall design, we test-drove the individual parts and are good to go.

So this is how – in my experience – you beat Evil FizzBuzz.

  1. Shave those yaks first! You need to pull together a complete delivery pipeline, that includes getting it on to the customer’s machine and ready to demo, as soon as you can. The key is to keep things simple and to stick to standards and conventions for the technology you’ve chosen. It helps enormously, of course, if you have a good amount of experience with these tools. If you don’t, I recommend working on that before attempting Evil FizzBuzz. “DevOps” is all the rage, but surprisingly few developers actually get much practice at it. Very importantly, if your delivery pipeline isn’t up and running, the whole delivery machine is blocked. Unshaved yaks are everybody’s problem. Don’t have one pair “doing the build” while the rest of you go away and work on code. How’s your code going to get into the finished solution and on to the customer’s machine?
  2. Get the bigger picture and keep it in sight the whole time. Whether it’s through mob programming, sketching on a whiteboard or whatever – involve the whole team and nail that birds-eye view before you split off. And, crucially, keep revisiting your final customer test. Lack of visibility of the end product is something teams working on real products and projects cite as a major barrier to getting the right thing done. Invisible progress often turns out to be no progress at all. As ‘details people’, we tend to be bad at the bigger picture. Work on getting better at it.
  3. Integrate early and often. You might only have 3 unit tests to pass for your part in a one-hour exercise, but that’s 3 opportunities to test and share your changes with the rest of the team. And the other side of that coin – pull whenever you see someone else’s build succeed, and test their changes on your desktop straight away. 5 pairs trying to merge a bunch of changes in the last 15 minutes often becomes a train wreck. Frequent, small merges work much better on average.

Action(Object), Object.Action() and Encapsulation

Just a quick post to bookmark an interesting discussion happening in Twitter right now in response to a little tweet I sent out.

Lots of different takes on this, but they tend to fall into three rough camps:

  • Lots of developers prefer action(object) because it reads the way we understand it – buy(cd), kick(ball) etc. Although, of course, this would imply functional programming (or static methods of unnamed classes)
  • Some like a subject, too – customer.buy(cd), player.kick(ball)
  • Some prefer the classic OOP – ball.kick(), cd.buy()

More than a few invented new requirements, I noticed. A discussion about YAGNI is for another time, though, I think.

Now, the problem with attaching the behaviour to a subject (or a function or static method of a different module or class) is you can end up with Feature Envy.

Let’s just say, for the sake of argument, that kicking a ball changes its position along an X-Y vector:

class Player(object):
    @staticmethod
    def kick(ball, vector):
        ball.x = ball.x + vector.x
        ball.y = ball.y + vector.y


class Ball(object):
    def __init__(self):
        self.x = 0
        self.y = 0


class Vector(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


if __name__ == "__main__":
    ball = Ball()
    Player.kick(ball, Vector(5,5))
    print("Ball -> x =", ball.x, ", y =", ball.y)

Player.kick() has Feature Envy for the fields of Ball. Separating agency from data, I’ve observed, tends to lead to data classes – classes that are just made of fields (or getters and setters for fields, which is just as bad from a coupling point of view) – and lots of low-level coupling at the other end of the relationship.

If I eliminate the Feature Envy, I end up with:

class Player(object):
    @staticmethod
    def kick(ball, vector):
        ball.kick(vector)


class Ball(object):
    def __init__(self):
        self.x = 0
        self.y = 0

    def kick(self, vector):
        self.x = self.x + vector.x
        self.y = self.y + vector.y

And in this example – if we don’t invent any extra requirements – we don’t necessarily need Player at all. YAGNI.

class Ball(object):
    def __init__(self):
        self.x = 0
        self.y = 0

    def kick(self, vector):
        self.x = self.x + vector.x
        self.y = self.y + vector.y


class Vector(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


if __name__ == "__main__":
    ball = Ball()
    ball.kick(Vector(5,5))
    print("Ball -> x =", ball.x, ", y =", ball.y)

So we reduce coupling and simplify the design – no need for a subject, just an object. The price we pay – the trade-off, if you like – is that some developers find ball.kick() counter-intuitive.

It’s a can of worms!

Code Craft : Part III – Unit Tests are an Early Warning System for Programmers

Before I was introduced to code craft, my way of checking that the programs I wrote worked was to run them and use them and see if they did what I expected them to do.

Consider this command line program I wrote that does some simple maths:
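A minimal sketch of what such a maths.py might look like follows. The function bodies, the error messages and the `main` helper are assumptions, chosen to match the command-line output in this post rather than taken from the original program.

```python
import math
import sys


def sqrt(number):
    if number < 0:
        raise ValueError("Cannot take the square root of a negative number")
    return math.sqrt(number)


def factorial(number):
    # Reject negatives and non-whole numbers such as 0.5.
    if number < 0 or number != int(number):
        raise ValueError("Factorial is only defined for whole numbers >= 0")
    return math.factorial(int(number))


def floor(number):
    # Return a float so that 4.7 is reported as 4.0.
    return float(math.floor(number))


def ceiling(number):
    return float(math.ceil(number))


def main(args):
    """Build the result message for arguments such as ['sqrt', '4.0']."""
    if len(args) != 2:
        return "Usage: python maths.py <sqrt|factorial|floor|ceiling> <number>"
    operation, number = args[0], float(args[1])
    if operation == "sqrt":
        return f"The square root of {number} = {sqrt(number)}"
    if operation == "factorial":
        return f"{args[1]} factorial = {factorial(number)}"
    if operation == "floor":
        return f"The floor of {number} = {floor(number)}"
    if operation == "ceiling":
        return f"The ceiling of {number} = {ceiling(number)}"
    return f"Unknown operation: {operation}"


if __name__ == "__main__":
    print(main(sys.argv[1:]))
```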

I can run this program with different inputs to check if the results of the calculations are correct.

C:\Users\User\PycharmProjects\pymaths>python maths.py sqrt 4.0
The square root of 4.0 = 2.0

C:\Users\User\PycharmProjects\pymaths>python maths.py factorial 5
5 factorial = 120

C:\Users\User\PycharmProjects\pymaths>python maths.py floor 4.7
The floor of 4.7 = 4.0

C:\Users\User\PycharmProjects\pymaths>python maths.py ceiling 2.3
The ceiling of 2.3 = 3.0

Testing my code by using the program is fine if I want to check that it works first time around.

These four test cases, though, don’t give me a lot of confidence that the code really works for all the inputs my program has to handle. I’d want to cover more examples, perhaps using a list to remind me what tests I should do.

  • sqrt 0.0 = 0.0
  • sqrt -1.0 -> should raise an exception
  • sqrt 1.0 = 1.0
  • sqrt 4.0 = 2.0
  • sqrt 6.25 = 2.5
  • factorial 0 = 1
  • factorial 1 = 1
  • factorial 5 = 120
  • factorial -1 -> should raise an exception
  • factorial 0.5 -> should raise an exception
  • floor 0.0 = 0.0
  • floor 4.7 = 4.0
  • floor -4.7 = -5.0
  • ceiling 0.0 = 0.0
  • ceiling 2.3 = 3.0
  • ceiling -2.3 = -2.0

Now, that’s a lot of test cases (and we haven’t even thought about how we handle incorrect command line arguments yet).

To run the program and try all of these test cases once seems like quite a bit of work, but if it’s got to be done, it’s got to be done. (The alternative is not doing all these tests, and then how do we know our program really works?)

But what if I need to change my maths code? (And if we know one thing about code, it’s that it changes). Then I’ll need to perform these tests again. And if I change the code again, I have to do the tests again. And again. And again. And again.

If we don’t re-test the code after we’ve changed it, we risk not knowing if we’ve broken it. I don’t know about you, but I’m not happy with the idea of my end users being lumbered with broken software. So I re-test the software every time it changes.

It took me about 5-6 minutes to perform all of these tests using the command line. That’s 5-6 minutes of testing every time I need to change my code. And maybe 5-6 minutes of testing doesn’t sound like a lot, but this program only has about 40 lines of code. Extrapolate that testing time to 1,000 lines of code. Or 10,000 lines. Or a million.

Testing programs by using them – what we call manual testing – simply doesn’t scale up to large amounts of code. The time it takes to re-test our program when we’ve changed the code becomes an obstacle to making those changes safely. If it takes hours or days or even weeks to re-test it, then change will be slow and difficult. It may even be impractical to change it at all, and far too many programs lots of people rely on end up in this situation. The time taken to test our code has a profound impact on the cost of making changes.

Studies have shown that the effort required to fix a bug rises dramatically the longer that bug goes undiscovered.

[Figure: cost of correcting defects (Boehm and Basili)]

If it takes a week to re-test our program, then the cost of fixing the bugs that testing discovers will be much higher than if we’d been alerted a minute after we made that error. The average programmer can introduce a lot of bugs in a week.

Creating good working software depends heavily on our ability to check that the code’s working very frequently – almost continuously, in fact. So we have to be able to perform our tests very, very quickly. And that’s not possible when we perform them manually.

So, how could we speed up testing to make changes quicker and easier? Well we’re computer programmers – so how about we write a computer program to test our code?

A few things to note about my test code:

  • Each test case has a unique name to make it easy to identify which test failed
  • There are two helper functions that ask if the actual result matches the expected result – either an expected output, or an expected exception that should have been raised
  • The script counts the total number of tests run and the number of tests passed, so it can summarise the result of running this suite of tests
  • My test code isn’t testing the whole program from the outside, like I was doing at the command line. Some code just tests the sqrt function, some just tests the factorial function, and so on. Tests that only test parts of a program are often referred to as unit tests. A ‘unit’ could be an individual function or a method of a class, or a whole class or module, or a group of these things working together to do a specific job. Opinions vary, but what we mostly all agree is that a unit is a discrete part of a program, and not the whole program.

The advantages of testing units instead of whole programs are important:

  1. When a test fails, it’s much easier to pinpoint the source of the problem
  2. Less code is executed in order to check a specific piece of logic works, so unit tests tend to run much faster
  3. By invoking functions directly, there’s usually less code involved in writing a unit test

When I run my test script, if all the tests pass, I get this output:

Running math tests…
Tests run: 16
Passed: 16 , Failed: 0

Phew! All my tests are passing.

This suite of tests ran in a fraction of a second, meaning I can run them as many times as I like, as often as I want. I can change a single line of code, then run my tests to check that change didn’t break anything. If I make a boo-boo, there’s a high chance my tests will alert me straight away. We say that these automated tests give me high assurance that – at any point in time – my code is working.

This ability to re-test our code after just a single change can make a huge difference to how we program. If I break the code, very little has changed since the code was last working, so it’s much easier to pinpoint what’s gone wrong and much easier to fix it. If I’ve made 100 changes before I re-test the code, it could be a lot of work to figure out which change(s) caused the problem. I have found, after 25 years of writing unit tests, that I need to spend very little time in my debugger.

If any tests fail, I get this kind of output:

Running math tests…
sqrt of 0.0 failed – expected 1.0 , actual 0
sqrt of -1.0 failed – expected Exception to be raised
Tests run: 16
Passed: 14 , Failed: 2

It helpfully tells me which tests failed, and what the expected and actual results were, to make it easier for me to pin down the cause of the problem. Since I only made a small change to the code since the tests last all passed, it’s easy for me to fix.

Notice that I’ve grouped my tests by the function that they’re testing. There’s a bunch of tests for the sqrt function, a bunch for factorial, and more for floor and for ceiling. As my maths program grows, I’ll add many more tests. Keeping them all in one big module will get unmanageable, so it makes sense to split them out into their own modules. That makes them easier to manage, and also allows us to run just the tests for, say, sqrt, or just the tests for factorial – if we only changed code in those parts of the program – if we want to.

Here I’ve split the tests for sqrt into their own test module, which we call a test fixture. It can be run by itself, or can be invoked as part of the main test suite along with the other test fixtures.

The two helper functions I wrote that check and record the result of each test – assert_equals and assert_raises – could be reused in other suites of tests, since they’re quite generic. What I’ve created here could be the beginnings of a reusable library for writing test scripts in Python.

As my maths program grows, and I add more and more tests, there’ll likely be more helper functions I’ll find useful. But, in computing, before you set out to write a reusable library to help you with something, it’s usually a good idea to check if someone’s already written one.

For a problem as common as automating program tests, you won’t be surprised that such libraries already exist. Python has several, but the most commonly used test automation library actually comes as part of Python’s standard modules – unittest (formerly known as PyUnit.)

Here are the sqrt tests I wrote, translated into unittest tests.
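A sketch of how that translation might look is below. The class name SqrtTest and the test names test_sqrt_0 and test_sqrt_minus1 appear in the test output in this post; the remaining test names, the error message and the stand-in sqrt function are assumptions for illustration.

```python
import math
import unittest


def sqrt(number):
    # Stand-in for the maths program's sqrt, so this file runs by itself.
    if number < 0:
        raise ValueError("Cannot take the square root of a negative number")
    return math.sqrt(number)


class SqrtTest(unittest.TestCase):
    def test_sqrt_0(self):
        self.assertEqual(0.0, sqrt(0.0))

    def test_sqrt_minus1(self):
        # Checks both the exception type and that the message mentions 'negative'.
        self.assertRaisesRegex(ValueError, "negative", lambda: sqrt(-1.0))

    def test_sqrt_1(self):
        self.assertEqual(1.0, sqrt(1.0))

    def test_sqrt_4(self):
        self.assertEqual(2.0, sqrt(4.0))

    def test_sqrt_6_25(self):
        self.assertEqual(2.5, sqrt(6.25))
```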

There’s a lot to unittest, but this test fixture uses just some of its basic features.

To create a test fixture, you just need to declare a class that inherits from unittest.TestCase. Individual tests are methods of your fixture class that start with test_ – so that unittest knows it’s a test – and they accept no parameters, and return no data.

The TestCase class defines many useful helper methods for making assertions about the result of a test. Here, I’ve used assertEqual and assertRaisesRegex.

assertEqual takes an expected result value as the first parameter, followed by the actual result, and compares the two. If they don’t match, the test fails.

assertRaisesRegex is like my own assert_raises, except that it also matches the error message the exception is raised with using regular expressions – so we can check that it was the exact exception we expected.

I don’t need to write a test suite that directly invokes this test fixture’s tests. The unittest test runner will examine the test code, find the test fixtures and test methods, and build the suite out of all the tests it finds. This saves me a fair amount of coding.

I can run the sqrt tests from the command line:

C:\Users\User\PycharmProjects\pymaths\test>python -m unittest sqrt_test.py
.....
----------------------------------------------------------------------
Ran 5 tests in 0.002s

OK

If any tests fail, unittest will tell me which tests failed and provide helpful diagnostic information.

C:\Users\User\PycharmProjects\pymaths\test>python -m unittest sqrt_test.py
F...F
======================================================================
FAIL: test_sqrt_0 (sqrt_test.SqrtTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\User\PycharmProjects\pymaths\test\sqrt_test.py", line 8, in test_sqrt_0
    self.assertEqual(1.0, sqrt(0.0))
AssertionError: 1.0 != 0

======================================================================
FAIL: test_sqrt_minus1 (sqrt_test.SqrtTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\User\PycharmProjects\pymaths\test\sqrt_test.py", line 13, in test_sqrt_minus1
    lambda: sqrt(1))
AssertionError: Exception not raised by <lambda>

----------------------------------------------------------------------
Ran 5 tests in 0.002s

FAILED (failures=2)

I can run all of the tests in my project folder at the command line using unittest‘s test discovery feature.

C:\Users\User\PycharmProjects\pymaths\test>python -m unittest discover -p "*_test.py"
................
----------------------------------------------------------------------
Ran 16 tests in 0.004s

OK

The test runner finds all tests in files matching ‘*_test.py’ in the current folder and runs them for me. Easy as peas!

You may have noticed that my tests are in a subfolder C:\Users\User\PycharmProjects\pymaths\test, too. It’s a very good idea to keep your test code separate from the code it’s testing, so you can easily see which is which.

Note how each test method has a meaningful name that identifies the test case, just like the test names in my hand-rolled unit tests before.

Note also that each test only asks one question – Is the sqrt of four 2? Is the factorial of five 120? And so on. When a test fails, it can only really be for one reason, which makes debugging much, much easier.

When I’m programming, I put in significant effort to make sure that as much of my code is tested by automated unit tests as possible. And, yes, this means I may well end up writing as much unit test code as solution code – if not more.

A common objection inexperienced programmers have to unit testing is that they have to write twice as much code. Surely this takes twice as long? Surely we could add twice as many features if we didn’t waste time writing unit test code?

Well, here’s the funny thing: as our program grows, we tend to find – if we rely on slow manual testing to catch the bugs we’ve introduced – that the proportion of the time we spend fixing bugs grows too. Teams who do testing the hard way often end up spending most of their time bug fixing.

Because bugs can cost exponentially more to fix the longer they go undiscovered, we find that the effort we put in up-front to write fast tests that will catch them more than pays for itself later on in time saved.

Sure, if the program you’re writing is only ever going to be 100 lines long, extensive unit tests might be a waste (although I would still write a few, as I’ve found even on relatively simple programs some unit testing has saved me time). But most programs are much larger, and therefore unit tests are a good idea most of the time. You wouldn’t fit a smoke alarm in a tiny Lego house, but in a real house that people live in, you might be very grateful for one.

One final thought about unit tests. Consider this code that calculates rental prices of movies based on their IMDb ratings:

This code fetches information about a video, using its IMDb ID, from a web service. Using that information, it decides whether to charge a premium of £1 because the video has a high IMDb rating or knock off £1 because the video has a low IMDb rating.
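A minimal sketch of the kind of code I mean might look like this. The URL, the JSON shape, the base price and the rating thresholds are all assumptions I've made up for illustration, not details of a real service.

```python
import json
import urllib.request

# Hypothetical endpoint, assumed for illustration only.
VIDEO_INFO_URL = "https://example.com/videos/{imdb_id}"


class Pricer:
    def price(self, imdb_id):
        # Every call goes out over the network to fetch the video's details.
        url = VIDEO_INFO_URL.format(imdb_id=imdb_id)
        with urllib.request.urlopen(url) as response:
            # Assumed response shape: {"title": ..., "rating": ...}
            info = json.load(response)

        price = 4.0  # assumed base rental price, in pounds
        if info["rating"] > 7.0:    # assumed threshold for a high rating
            price += 1.0
        elif info["rating"] < 4.0:  # assumed threshold for a low rating
            price -= 1.0
        return price
```

Notice that the pricing logic and the web service call live in the same method, so there's no way to exercise one without the other.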

If we wrote a unittest test for this, then when it runs, our code will connect to an external web service to fetch information about the video we’re pricing. Connecting to web services is slow in comparison to things that happen entirely in memory. But we want our unit tests to run as fast as possible.

How could we test that prices are calculated correctly without connecting to this external service?

Our pricing logic requires movie information that comes from someone else’s software. Could we fake that somehow, so a rating is available for us to test with?

What if, instead of the price method connecting directly to the web service itself, we were to provide it with an object that fetches video information for it? That is, what if we made fetching video information somebody else’s problem? The object is passed in as a parameter of Pricer’s constructor like this.

Because videoInfo is passed as a constructor parameter, Pricer only knows what that object looks like from the outside. It knows it has to have a fetch_video_info method that accepts an IMDb ID as a parameter and returns the title and IMDb rating of that video.
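Here's a rough sketch of that shape. The base price, the rating thresholds and the dictionary returned by fetch_video_info are assumptions for illustration; only the videoInfo parameter and the fetch_video_info method name come from the description above.

```python
class Pricer:
    def __init__(self, videoInfo):
        # videoInfo can be any object with a fetch_video_info(imdb_id) method.
        self.videoInfo = videoInfo

    def price(self, imdb_id):
        # Assumed return shape: a dict with "title" and "rating" keys.
        info = self.videoInfo.fetch_video_info(imdb_id)

        price = 4.0  # assumed base rental price, in pounds
        if info["rating"] > 7.0:    # assumed threshold for a high rating
            price += 1.0
        elif info["rating"] < 4.0:  # assumed threshold for a low rating
            price -= 1.0
        return price
```

Pricer no longer imports or mentions anything to do with web services. Where the rating comes from is decided by whoever constructs it.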

Thanks to Python’s duck typing – if it walks like a duck and quacks like a duck etc – any object that has a matching method should work inside Pricer, including one that doesn’t actually connect to the web service.

We could write a class that provides whatever title and IMDb rating we tell it to, and use that in a unit test for Pricer.
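Something along these lines, perhaps. VideoInfoStub is a name I've chosen for this sketch, and the Pricer shown here uses an assumed base price and assumed rating thresholds so that the example runs on its own.

```python
import unittest


class Pricer:
    def __init__(self, videoInfo):
        self.videoInfo = videoInfo

    def price(self, imdb_id):
        info = self.videoInfo.fetch_video_info(imdb_id)
        price = 4.0  # assumed base rental price, in pounds
        if info["rating"] > 7.0:    # assumed threshold for a high rating
            price += 1.0
        elif info["rating"] < 4.0:  # assumed threshold for a low rating
            price -= 1.0
        return price


class VideoInfoStub:
    def __init__(self, title, rating):
        self.title = title
        self.rating = rating

    def fetch_video_info(self, imdb_id):
        # Ignores the ID and returns the canned data. No network involved.
        return {"title": self.title, "rating": self.rating}


class PricerTest(unittest.TestCase):
    def test_high_rated_video_costs_a_pound_more(self):
        pricer = Pricer(VideoInfoStub("High Rated Movie", 8.0))
        self.assertEqual(5.0, pricer.price("tt0000001"))

    def test_low_rated_video_costs_a_pound_less(self):
        pricer = Pricer(VideoInfoStub("Low Rated Movie", 3.0))
        self.assertEqual(3.0, pricer.price("tt0000002"))
```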

When I run this test, it checks the pricing logic just as thoroughly as if we’d fetched the video information from the real web service. How video titles and ratings are obtained has nothing to do with how rental prices are calculated. We achieved flexibility in our design by cleanly separating those concerns. (Separation of Concerns is fancy software architecture-speak for “make it someone else’s problem”.)

The object that fetches video information is passed in to the Pricer. We call this dependency injection. Pricer depends on VideoInfo, but because the dependency is passed in as a parameter from the outside, the calling code can decide which implementation to use – the stub, or the real thing.

A stub is one kind of what we call a test double. It’s an object that looks like the real thing from the outside, but has a different implementation inside. The job of a stub is to provide test data that would normally come from some external source – like video titles and IMDb ratings.

Test doubles require us to introduce flexibility into our code, so that objects (or functions) can use each other without knowing exactly which implementation they’re using – just as long as they look the same as the real thing from the outside. This not only helps us to write fast-running unit tests, but is good design generally. What if we need to fetch video information from a different web service? Because we provide video information by dependency injection, we can easily swap in a different web service with no need to rewrite Pricer.

This is what we really mean by ‘separation of concerns’ – we can change one part of the program without having to change any of the other parts. This can make changing code much, much easier.

Let’s look at one final example that involves an external dependency. Consider this code that totals the number of copies of a song sold on a digital download service, then sends that total to a web service that compiles song charts at the end of each day.

How can we unit test that song sales are calculated correctly without connecting to the external web service? Again, the trick here is to separate those two concerns – to make sending sales information to the charts somebody else’s problem.
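With the concerns separated, the code might be shaped something like this sketch. SalesTracker, the list of (song, copies) downloads and the returned total are assumptions of mine for illustration; sales_of and charts.send are the only names taken from the discussion here.

```python
class Charts:
    def send(self, song, sales):
        # The real thing would connect to the charts web service here.
        raise NotImplementedError("would connect to the charts web service")


class SalesTracker:  # hypothetical name for the class that totals sales
    def __init__(self, charts, downloads):
        self.charts = charts        # injected, so a test can pass in a double
        self.downloads = downloads  # assumed: a list of (song, copies) tuples

    def sales_of(self, song):
        total = sum(copies for title, copies in self.downloads if title == song)
        self.charts.send(song, total)  # sending is the charts object's problem
        return total
```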

Before we write a unit test for this, notice how this situation is different to the video pricing example. Here, our charts object doesn’t return any data. So we can’t use a stub in this case.

When we want to swap in a test double for an object that’s going to be used, but doesn’t return any data that we need to worry about, we can choose from two other kinds of test double.

A dummy is an object that looks like the real thing from the outside, but does nothing inside.
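A minimal sketch of a test that uses a dummy, assuming a hypothetical SalesTracker class that takes the charts object and a list of (song, copies) downloads in its constructor:

```python
import unittest


class SalesTracker:  # hypothetical class, assumed for this sketch
    def __init__(self, charts, downloads):
        self.charts = charts
        self.downloads = downloads  # assumed: a list of (song, copies) tuples

    def sales_of(self, song):
        total = sum(copies for title, copies in self.downloads if title == song)
        self.charts.send(song, total)
        return total


class ChartsDummy:
    def send(self, song, sales):
        pass  # deliberately does nothing


class SalesTest(unittest.TestCase):
    def test_total_sales_of_song(self):
        downloads = [("Song A", 2), ("Song B", 5), ("Song A", 1)]
        tracker = SalesTracker(ChartsDummy(), downloads)
        self.assertEqual(3, tracker.sales_of("Song A"))
```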

In this test, we don’t care if the sales total for the song is sent to the charts. It’s all about calculating that total.

But what if we do care if the total is sent to the charts once it’s been calculated? How could we write a test that will fail if charts.send isn’t invoked?

A mock object is a test double that remembers when its methods are called so we can test that the call happened. Using the built-in features of the unittest.mock library, we can create a mock charts object and verify that send is invoked with the exact parameter values we want.
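A sketch of such a test follows. MagicMock and assert_called_once_with are part of unittest.mock; the SalesTracker class and the body of Charts are hypothetical stand-ins I've assumed so the example runs on its own.

```python
import unittest
from unittest.mock import MagicMock


class Charts:
    def send(self, song, sales):
        # Stand-in for the real call to the charts web service.
        raise NotImplementedError("would connect to the charts web service")


class SalesTracker:  # hypothetical class, assumed for this sketch
    def __init__(self, charts, downloads):
        self.charts = charts
        self.downloads = downloads  # assumed: a list of (song, copies) tuples

    def sales_of(self, song):
        total = sum(copies for title, copies in self.downloads if title == song)
        self.charts.send(song, total)
        return total


class SalesTest(unittest.TestCase):
    def test_sales_total_is_sent_to_charts(self):
        charts = Charts()
        charts.send = MagicMock()  # replaces send on this one instance only
        tracker = SalesTracker(charts, [("Song A", 2), ("Song A", 1)])

        tracker.sales_of("Song A")

        # Fails if send was never called, or was called with other values.
        charts.send.assert_called_once_with("Song A", 3)
```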

In this test, we create an instance of the real Charts class that connects to the web service, but we replace its send method with a MagicMock that records when it’s invoked. We can then assert at the end that when sales_of is executed, charts.send is called with the correct song and sales total.

So there you have it. Unit tests – tests that test part of our program, and execute without connecting to any external resources like web services, file systems, databases and so on – are fast-running tests that allow us to test and re-test our program very frequently, ensuring as much as possible that our code’s always working.

As you’ll see in later posts, good, fast-running unit tests are an essential foundation of code craft, enabling many of the techniques we’ll be covering next.

Code Craft : Part II – Version Control is Seat Belts for Programmers

When I was starting out as a professional programmer, I took the basic precaution of occasionally backing up my code so that if I took a “wrong turn”, I could get back to something that kind of sort of worked. I used to do this the old-fashioned way of creating a daily folder and copying the code into it.

But, it turns out, a day is a lot of work to lose. When things did go wrong – which happened regularly – I’d only go back to the previous day’s code as a last resort. Usually, I’d try and fix the problem, which took up a lot of time and typically had disappointing results.

Also, my hard drive very quickly filled up with back-ups if I didn’t get into the habit of deleting older copies. Maybe I changed 5 lines of code that day; making an entire copy of 500,000 lines of code every time is pretty wasteful. And if I made back-ups more often, the drive would fill up faster. In the 1990s, disk space was still expensive.

The effect of making infrequent back-ups on the way I worked was quite profound. When you risk losing a day’s work when you try something new, you take fewer risks. Fear tends to stifle creativity and innovation.

Really, I should have been making back-ups far more frequently – at least every hour or so – and the only way for that to be practical on a PC with a 100MB hard drive is to not back up all the source code every single time, but only the parts that have changed.

I was several days into attempting to write something that enabled this when a more experienced programmer told me that such tools already existed. (That happens a lot in computing.)

His team were using what he called a “version control system” or VCS – in this case a tool called CVS (Concurrent Versions System). CVS was relatively new at the time (it was first released in 1990), but I later learned that version control systems had been around since the early 1970s.

A code project was copied to a central repository for the team to access, and they could “check in” any changes they made to source code files, and CVS stored the changes as a “delta”, keeping a history of all revisions to every file in the repository. Using the original source files and the deltas, CVS could recreate any version of the code from any point in its history.

I very quickly realised that this was super-useful. Not only could you get back to any version of the code with ease, without filling your hard drive up with copies, but you could also see the entire history of the code and analyse how it has evolved. Think of a version history as being a bit like a computer program’s own personal diary, logging every interesting change that’s been made – potentially going back years. Much can be learned by reading diaries.

I’ve been using version control systems ever since. And in the 25 years since then, they have become very widespread. Most professional programmers use version control these days. So it’s curious – and a little alarming – that many schools and universities don’t teach students how to use them (or even tell students they exist).

The most popular VCS in use today is Git. Git is what we call a distributed version control system (DVCS). As well as a central repository of source code files, it also allows programmers to keep their own local repository, in which they can track changes they make on their own computer, before “pushing” those changes to the central repository to share with the other programmers on the team.

A simple workflow for version control with Git might go something like this (using the Git command line program in Bash):

  • Initialise a folder on your computer to be a local Git code repository

User@DESKTOP-KSHARRN MINGW64 /c/python_projects/maths
$ git init
Initialized empty Git repository in C:/python_projects/maths/.git/

  • In my maths folder, I create a Python script called sqrt.py. 
  • If I want this file to be version-controlled, I need to add it to the Git repository.

User@DESKTOP-KSHARRN MINGW64 /c/python_projects/maths (master)
$ git add sqrt.py

  • sqrt.py is put into a “staging area” that contains all of the file changes (files added, files modified, files deleted) for my first commit to my local Git repository. Let’s commit this with a meaningful message that helps identify what version of the code this is.

User@DESKTOP-KSHARRN MINGW64 /c/python_projects/maths (master)
$ git commit -m "This is my first commit"
[master (root-commit) 3e39188] This is my first commit
1 file changed, 14 insertions(+)
create mode 100644 sqrt.py

  • If I make a change to sqrt.py
  • …and then commit that change…

User@DESKTOP-KSHARRN MINGW64 /c/python_projects/maths (master)
$ git commit -m 'Changed input to be square rooted' --all
[master 75b5aef] Changed input to be square rooted
1 file changed, 1 deletion(-)

  • …we add a new version of the source file to our local repository. We can see the version history of our repository using Git’s log command.

User@DESKTOP-KSHARRN MINGW64 /c/python_projects/maths (master)
$ git log
commit f113f51030eab07943b9e8f9493d17a2209544d2
Author: Jason Gorman <jason.gorman@codemanship.com>
Date:   Wed Oct 2 08:39:12 2019 +0100

    Changed input to be square rooted

commit 3e391889c26574357b35f687413f2eb5d9e4f2c1
Author: Jason Gorman <jason.gorman@codemanship.com>
Date:   Wed Oct 2 08:17:51 2019 +0100

    This is my first commit

  • If I then make a boo-boo in this code…
  • …I can get back to either of those versions by using Git’s reset command.

User@DESKTOP-KSHARRN MINGW64 /c/python_projects/maths (master)
$ git reset --hard f113f51030eab07943b9e8f9493d17a2209544d2
HEAD is now at f113f51 Changed input to be square rooted

  • And I can go back to any version in the code’s history if I want. I just tell it which version – the long identifier Git assigns to each commit – I want to go back to. Ultimate undo-ability!

User@DESKTOP-KSHARRN MINGW64 /c/python_projects/maths (master)
$ git reset --hard 3e391889c26574357b35f687413f2eb5d9e4f2c1
HEAD is now at 3e39188 This is my first commit

Remember that Git is what we call a distributed version control system (DVCS), so the version history of my sqrt.py file is stored in a local repository on my computer. I can also create a shared remote repository – for example, on github.com – so other programmers can access the files and their histories and contribute to my maths project.

  • First, I create a new repository using my GitHub account. I’ve called it pymaths.

  • Then I copy the remote repository’s unique URL

  • I can now add this remote repository for use with my local repository

User@DESKTOP-KSHARRN MINGW64 /c/python_projects/maths (master)
$ git remote add origin https://github.com/jasongorman/pymaths.git

  • Now I can push the commits I made to my local repository to the remote repository, where other programmers can access them.

User@DESKTOP-KSHARRN MINGW64 /c/python_projects/maths (master)
$ git push origin master
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Delta compression using up to 4 threads
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 365 bytes | 365.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To https://github.com/jasongorman/pymaths.git
* [new branch] master -> master

Now we can see that our commits are showing in the pymaths GitHub repository. (Bear in mind that I reset the code back to the original commit, so that’s the one showing as current.)

When multiple programmers are contributing to a repository, they need a way to get changes other people have made which they can merge into their own working directories.

Let’s say someone else on my team adds a function for calculating factorials.

They push their change to the pymaths repository. To merge their changes into my local copy, I can use the Git pull command.

User@DESKTOP-KSHARRN MINGW64 /c/python_projects/maths (master)
$ git pull origin master
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 3 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), done.
From https://github.com/jasongorman/pymaths
* branch master -> FETCH_HEAD
3e39188..23578b7 master -> origin/master
Updating 3e39188..23578b7
Fast-forward
sqrt.py | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)

That updates my local copy of sqrt.py to the new version from pymaths (Git calls this a fast-forward), which is fine if I haven’t also changed that file. What if we’ve both changed that file? That can lead to what’s called a merge conflict.

Imagine my team mate adds a function for calculating the ceiling of a number, and pushes that change to pymaths.

And at the same time, I add a function to my local copy for calculating the floor of a number and commit that to my local repository.

When I pull the changes from the remote repository, Git will attempt to merge the two versions of the file automatically, but in this case it fails.

User@DESKTOP-KSHARRN MINGW64 /c/python_projects/maths (master)
$ git pull origin master
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (1/1), done.
remote: Total 3 (delta 1), reused 3 (delta 1), pack-reused 0
Unpacking objects: 100% (3/3), done.
From https://github.com/jasongorman/pymaths
* branch master -> FETCH_HEAD
87973fb..421547e master -> origin/master
Auto-merging sqrt.py
CONFLICT (content): Merge conflict in sqrt.py
Automatic merge failed; fix conflicts and then commit the result.

To resolve the conflict, I just need to edit the auto-merged file, and then commit and push the finished version to pymaths.

In a sense, version control is like seat belts for programmers. It gives us a level of safety when we’re creating and evolving our programs that means we can invent and try new things with much greater confidence that – whatever happens – there’s a way back if it goes wrong.

Version control systems like Git make it much easier for programmers to collaborate on the same code projects, even if they are on opposite sides of the world. They have built-in features that help us to manage conflicting changes, and are an essential ingredient in individual and team efforts of all sizes.

Here are some basic good habits for version control that I’ve been successfully applying for 25 years:

  1. Unless it’s something genuinely trivial that you’re going to throw away, always start by putting your code project under version control
  2. Check in your changes frequently – at least every hour (I do it many times an hour, usually). The less often you do it, the more work you might lose.
  3. When working with others, merge their changes frequently into your local copy so you can keep up to date with what’s in the repository, and spot conflicts early when they’re easier to fix.
  4. Use meaningful commit messages to help you and other programmers easily identify what’s changed in that version of the code.
  5. Very importantly, don’t check in code that doesn’t work. If you break the program, and check in your changes, and then your team mates merge those changes into their copies, then you’ve just done the programming equivalent of giving everyone in the team your cold.

But how do we know it works before we commit? Well, we test it. Yes. Every single time before we commit. If it fails our tests, then we don’t commit it.
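That rule is simple enough to automate. Here's a minimal sketch, assuming the tests live in files matching *_test.py under the current folder; commit_if_tests_pass and its run parameter are hypothetical names of mine, with run there only so the function can itself be tested without touching Git.

```python
import subprocess
import sys


def commit_if_tests_pass(message, run=subprocess.run):
    # Run every *_test.py file through unittest's discovery, as on the command line.
    tests = run([sys.executable, "-m", "unittest", "discover", "-p", "*_test.py"])
    if tests.returncode != 0:
        print("Tests failed: not committing.")
        return False

    # Tests passed, so commit all changes to tracked files.
    run(["git", "commit", "--all", "-m", message], check=True)
    return True
```

Calling commit_if_tests_pass("Added floor function") from the project folder would then only create a commit when the test run exits cleanly.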

“Gee. Testing the program every time we want to commit our changes? And you say we should be committing frequently? That sounds like we’ll be spending all our time just testing our code!”

Yup. You’ll be testing and re-testing your program many times an hour. And in the next blog post I’ll be showing you how that can be achieved using fast-running automated unit tests.

Code Craft : Part I – Why We Need Code Craft

I started programming many years ago at the beginning of the home computing boom of the 1980s. Computers back then had hundreds of thousands of times less memory and processing power than your smartphone does today, so the programs we wrote for our ZX Spectrums, Commodore 64s and Acorn Micros were necessarily small.

[Image: The Commodore 64 (source: Wikipedia)]

Under the limitations of having just a few kilobytes of memory for our code, programming was a pretty manageable affair. We could fit the whole program in our heads, so to speak.

When Dad brought home the IBM-compatible PC he’d been using in the office – it had been replaced with a newer model – all that changed. Memory leapt from the 64K of our C64 to 2MB, and disk space added a further 20MB for program files.

[Image: Turbo C was a popular programming tool for the IBM PC (source: https://turbo-c.soft32.com/)]

Like a plant that gets re-potted in a much larger container, my code had a sudden growth spurt – luxuriating in the seemingly unlimited resources of the PC, and in the vastly superior programming languages and tools that were available for it.

In code, size is everything. A simple Commodore 64 game written in hundreds of lines of BASIC is a very different proposition to a game with thousands or tens of thousands of lines of code. That won’t fit inside your head. As the code grows, you quickly hit your limitations of brain power and human memory.

Debugging large programs is really hard. There’s just so much more that can go wrong, and so many more places to look for the sources of problems. It’s the proverbial “needle in a haystack”. Debugging a C64 program was like looking for a needle in a matchbox.

The time taken to sufficiently test a 300-line program is peanuts compared to the testing you need to do for a 30,000-line program. Minutes turn into days or even weeks of clicking buttons to see if the program does what you expect it to across hundreds of functions, each potentially with dozens of different scenarios to consider.

The time it takes to test our programs is a huge factor in how easy it is to change the code. Studies show that the longer a bug goes undiscovered, the more it will cost to fix it. If I make a boo-boo in the code and spot it straight away, it’s a moment’s work to correct it. If that boo-boo makes it on to end users’ computers, fixing it is a much bigger deal. It’s all about the size of the loop we have to go through to schedule time to do the fix, examine the code and find the cause of the bug, fix the bug, re-test the program and then release it back to the end users with the fix in place. If re-testing takes weeks, then that’s one expensive bug fix!

[Figure: Cost of correcting defects, Boehm and Basili (source: https://slcontrols.com/justify-early-extra-investment-reduce-late-budget-overruns/)]

We’re not just talking about coding errors, either. The most common kind of program bug – and one of the most expensive to fix – is when we wrote the wrong code. That is, we misunderstood – or just guessed at – what the end users wanted to do with the program and built the wrong thing. In my early career as a programmer, that happened a lot. Writing code is a very expensive way to find out what the users want, especially if your code is hard to change afterwards.

Changing large programs without breaking something is super hard. In a computer program, all the pieces are connected – directly or indirectly – so changing just one line of code can accidentally break a whole bunch of stuff. And as the program grows, with more and more interconnected parts, it gets harder and harder.

[Figure: Scott W. Ambler – Traditional Cost of Changing Software (source: http://www.agilemodeling.com/essays/costOfChange.htm)]

While I happily made the leap to much more memory, and much faster processors, and much more grown-up programming languages, I can now look back with the benefit of nearly 40 years of coding experience and see that the software I was writing back then was rigid – difficult to change – and brittle – easy to break – and buggy as heck.

I was a kid who’d been building tiny houses with Lego, who’d now progressed to building much bigger houses with real bricks and real cement and real timber – all the while thinking that building real houses was just like building Lego houses. NEWS FLASH: it isn’t.

[Image: Lego house. Photo by Shadowman39 (source: https://www.flickr.com/photos/99304214@N05/15756936161)]

There’s a lot, lot more that can go wrong building real houses, especially if you expect people to live in them. When you build people-sized houses, you have to think about a bunch of things that you don’t need to think about when you build Lego houses. Steps have to be taken to ensure the structural integrity of a house at all stages of construction and beyond. Otherwise, it can collapse under its own weight, causing expensive damage and even loss of life.

[Image: House building. Photo by David Martin (source: https://www.geograph.org.uk/photo/6046008)]

Ditto with large computer programs. There’s a bunch of things we need to think about for a 10,000-line or a 100,000-line or a 10,000,000-line program that just aren’t an issue on a 200-line program.

In particular, we have to take steps to avoid having our big program collapse under its own weight, when making a single change causes it to break in unexpected, and potentially dangerous, ways – depending on who’s using it and what they’re using it for.

It was only a few years into my career as a professional programmer that I learned how to write code that was reliable and easy to change without breaking – code that people can safely live in (both the end users, and other programmers coming to change the code as their users’ needs changed).

And here’s the thing; while we’re building our software, we are also living in the code. Like plasterers or carpenters or electricians working inside a house while it’s being built, we too are at risk from the thing collapsing on us. This is something it took me quite a while to appreciate. It’s not just about releasing good code that other people can live in. It’s about keeping the code that way while we’re writing it.

Many studies done on computer programming over the last few decades clearly show that code that’s rigid and brittle and buggy takes longer to get working in the first place.

[Figure: Less buggy programs take less time to deliver. Graph from ‘Software Quality At Top Speed’ – Steve McConnell]

Sure, for those first – easy – few hundred lines of code, we can go fast and don’t need to take a lot of care. But the effect of the size of our growing code hits us sooner than we might think, and soon we’re spending all our time trying to debug it to make it usable enough for a release. Many programming projects end with a “stabilisation” phase, where programmers work long hours debugging thousands and thousands of lines of code, trying to hit a deadline to make the software good enough for people to use.

We call this code-and-fix programming. We write a whole bunch of code as fast as we can. Then we test it and find a tonne of bugs we didn’t realise we’d introduced. Then we spend a whole lot more time trying to remove those bugs. (And very probably introducing all new bugs while we do that. And around we go.)

Code-and-fix may work on small programs, but it’s often a disaster on larger programs. It’s by far the most time-consuming and expensive way to get programs of any significant size working. And, even after all the debugging, it tends to produce programs that still contain many bugs.

After we’ve released our program, we’re not done yet. It’s in the nature of computer programs that when people use them, they see ways they could be improved. That first release is usually just the start of a long learning process, figuring out what users really need. So they’ll want a second release, and a third, and a fourth, and on and on it tends to go. Programming at scale is a marathon, not a sprint.

If our program code is rigid and brittle, changing it without breaking it is going to be very difficult. On the second go round, it takes even more time and effort to produce a working program. On the third, harder still. Far too many programs hit a barrier where the code is so hard and so risky to change that nobody dares try. At this point we face a difficult decision.

Do we leave it as it is, and the users will just have to struggle on without the changes they need? This is something that holds a lot of businesses back. If you’re Acme Supermarket, and you need to change how your tills work – but you can’t change the software – you have a big problem.

Do we write the program again from scratch? This means that all the program functions your end users currently rely on will have to be rebuilt from the ground up just to get one or two new functions. That’s like building a completely new house just so you can add a porch. Very expensive, and the users will have to wait a long time for their changes.

Do we abandon it altogether, and leave the end users to find their own solutions (or pack up and go home)? I’ve seen businesses do this in extreme cases. The part of their business that relied on legacy software that was hopelessly out of date for their needs, but too expensive to change and too expensive to rewrite, was simply shut down. “We can’t keep up with the competition and their whizzy new software, so we give up.”

And you might think that I’m just talking about computer programs that are written for businesses. But the reality is that, in my own personal programming projects, I’ve faced these decisions because I didn’t take enough care over my code. A 10,000-line program I wrote in C to help with a hobby music project at university had to be abandoned because I was spending all my spare time fixing it. It just got too much. So I ditched not just the code, but the whole project. Months of my life wasted.

Over the 37 years I’ve been coding, I shudder to think how much of my time was wasted debugging code that needn’t have been buggy in the first place. How much time did I waste redoing work I had to throw away because I made infrequent back-ups? How much time did I waste rewriting whole sections of programs because I hadn’t understood what the end users were asking me to build? How much time did I waste trying to understand my own code after I’d come back to it weeks or months later? How much time did I waste re-testing programs by hand?

Most importantly, what else could I have done with all that wasted time?

We’re talking thousands and thousands of hours, probably. Thousands of hours of debugging. Thousands of hours redoing stuff I broke because I didn’t have a recent back-up. Thousands of hours staring at code trying to understand what it does. Thousands of hours going round in circles testing programs by running them and clicking lots of buttons, fixing bugs, and then finding a bunch of new bugs when I re-tested it.

All that changed when I learned some basic code craft after 13 years programming the hard way.

I learned to use version control, checking my code in at least once an hour, so if I hit a dead end, I can easily get back to a working version, losing at most an hour’s work.

I learned to write fast-running automated unit tests, so I could re-test large programs with hundreds of functions in minutes or even seconds, alerting me immediately if I break something.

I learned to write code that people can understand, keeping it as simple as possible and carefully choosing names for functions and data that clearly explained what that piece of code does.

I learned to break large programs down into small, manageable and easy-to-understand chunks (modules) that do one distinct job, and how to compose large programs out of these simple pieces so that a change to one doesn’t break a bunch of connected modules.

I learned how to change code safely in tiny micro-steps, running my unit tests after each change, to keep the code working at all times.

I learned to communicate with end users using examples to pin down exactly what they’re asking for, and I translate those examples directly into tests so I can get immediate feedback on whether the program is doing what the customer wants.

I learned to continually test that changes I’ve made to the code work with changes any other programmers have made at the same time, and to test that it not only works on my computer, but on other computers, too.

And I learned to use automated scripts to build and deploy programs so that a change the users ask for in the morning can be running on their computers by lunchtime if necessary. Since the code is always working, and since I and other programmers on my team are continuously merging our changes into our version control repository, this means that our code is always ready to be released.

These techniques:

  • Version Control
  • Unit Testing
  • Simple Design
  • Modular Design
  • Refactoring
  • Specification By Example
  • Test-Driven Development
  • Continuous Integration
  • Continuous Delivery

…are the foundations of code craft. Master them, and you’ll waste far less time debugging, less time staring at code trying to understand what it does, less time redoing work because you didn’t make a recent back-up or because you misunderstood the requirements, and less time rewriting entire programs from scratch – leaving far more time for the fun stuff, like inventing, being creative, making your end users happy, and having a life outside programming.