What’s The Point of Code Craft?

A conversation I seem to have over and over again – my own personal Groundhog Day – is “What’s the point in code craft if we’re building the wrong thing?”

The implication is that there’s a zero-sum trade-off between customer collaboration – the means by which we figure out what’s needed – and technical discipline. Time spent writing unit tests or refactoring duplication or automating builds is time not spent talking with our customers.

This is predicated on two falsehoods:

  • It takes longer to deliver working code when we apply more technical discipline
  • The fastest way to solve a problem is to talk about it more

All the evidence we have strongly suggests that, in the majority of cases, better quality working software doesn’t take significantly longer to deliver. In fact, studies using large amounts of industry data repeatedly show the inverse. It – on average – takes longer to deliver software when we apply less technical discipline.

Teams who code and fix their software tend to end up having less time for their customers because they’re too busy fixing bugs and because their code is very expensive to change.

Then there’s the question of how to solve our customers’ problems. We can spend endless hours in meetings discussing it, or we can spend a bit of time coming up with a simple idea for a solution, build it quickly and release it straight away for end users to try for real. The feedback we get from people using our software tends to tell us much more, much sooner about what’s really needed.

I’ll take a series of rapid software releases over the equivalent series of requirements meetings any day of the week. I’ve seen this many times in the last three decades. Evolution vs Big Design Up-Front. Rapid iteration vs. Analysis Paralysis.

The real customer collaboration happens out in the field (or in the staging environment), where developers and end users learn from each small, frequent release and feed those lessons back into the next iteration. The map is not the terrain.

Code craft enables high-value customer collaboration by enabling rapid, working releases and by delivering code that’s much easier to change. Far from getting in the way of building the right thing, it is the way.

But…

That’s only if your design process is truly iterative. Teams that are just working through a backlog may see things differently, because they’re not setting out to solve the customer’s problem. They’re setting out to deliver a list of features that some people sitting in a room – these days very probably not the end users and not the developers themselves – guessed might solve the problem (if indeed, the problem was ever discussed).

In that situation, technical discipline won’t help you deliver the right thing. But it could help you deliver the wrong thing sooner for less expense. #FailFast

 

 

Refactoring To Closures in Kotlin & IntelliJ

I spent yesterday morning practicing a refactoring in Kotlin that I wanted to demonstrate in a workshop, and after half a dozen unsuccessful attempts, I found a way that seems relatively safe. I thought it might be useful to document it here, both for my future self and for anyone else who might be interested.

My goal here is to encapsulate the data used in this function for calculating quotes for fitted carpets. The solution I’m thinking of is closures.
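The function in question isn’t shown here, but a plausible Kotlin reconstruction of the kind of starting point I mean looks something like this (illustrative only – the real code is in the repo linked at the end):

    import kotlin.math.ceil

    // Starting point: quote() takes all the raw data it needs to calculate
    // the room area and the carpet price
    fun quote(width: Double, length: Double, pricePerSqrMtr: Double, roundUp: Boolean): Double {
        val area = roomArea(width, length)
        return carpetPrice(area, pricePerSqrMtr, roundUp)
    }

    fun roomArea(width: Double, length: Double): Double = width * length

    fun carpetPrice(area: Double, pricePerSqrMtr: Double, roundUp: Boolean): Double {
        val price = area * pricePerSqrMtr
        return if (roundUp) ceil(price) else price
    }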

How do I get from this to closures being injected into quote() safely? Here’s how I did it in IntelliJ. (A sketch of where the steps end up follows the list.)

1. Use the Function to Scope… refactoring to extract the body of, say, the roomArea() function into an internal function.

2. As a single step, change the return type of roomArea() to a function signature that matches area(), return a reference to ::area instead of the return value from area(), and change quote() to invoke the returned area() function. (Phew!)

3. Rename roomArea() to room() so it makes more sense.

4. In quote(), highlight the expression room(width, length) and use the Extract Parameter refactoring to have that passed into quote() from the tests.

5. Now we’re going to do something similar for carpetPrice(), with one small difference. First, as with roomArea(), use the Function to Scope refactoring to extract the body of carpetPrice() into an internal function.

6. Then swap the return value with a reference to the ::price function.

7. Now, this time we want the area to be passed in as a parameter to the price() function. Extract Parameter area from price(), change the signature of the returned function and update quote() to pass it in using the area() function. Again, this must be a single step.

8. Change the Signature of carpetPrice() to remove the redundant area parameter.

9. Rename carpetPrice() to carpet() so it makes more sense.

10. Finally, use Extract Parameter on the expression carpet(pricePerSqrMtr, roundUp) in quote(), naming the new parameter price.
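Here’s roughly where those ten steps land us – a sketch of the final shape, based on the reconstruction above (again, the real code is in the repo linked below):

    import kotlin.math.ceil

    // quote() no longer sees widths, lengths or prices – that data is
    // encapsulated in the closures injected by the caller (e.g., the tests)
    fun quote(area: () -> Double, price: (Double) -> Double): Double = price(area())

    fun room(width: Double, length: Double): () -> Double {
        fun area() = width * length
        return ::area
    }

    fun carpet(pricePerSqrMtr: Double, roundUp: Boolean): (Double) -> Double {
        fun price(area: Double): Double {
            val total = area * pricePerSqrMtr
            return if (roundUp) ceil(total) else total
        }
        return ::price
    }

    // Called from the tests, e.g.: quote(room(2.5, 4.0), carpet(20.0, true))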

 

If you want to have a crack at this yourself, the source code is at https://github.com/jasongorman/kotlin_simple_design/tree/master/fp_modular_design , along with two more examples (an OO/Java version of this, plus another example that breaks all the rules of both Simple Design and modular design in Kotlin).

 

Action->Object vs. Object->Action

One of the things I see programmers new to objects struggle with is our natural tendency to separate agency from data. Things do things to other things. The VCR plays the video. The toaster heats the toast. The driver drives the taxi. Etc.

I think it’s possibly linguistic, too: in most natural languages, we put the object after the verb: play(video), toast(bread), drive(taxi).

Thing is, though – this isn’t how object oriented programming works. Objects encapsulate agency with the data they work on, producing video.play(), bread.toast() and taxi.drive().
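A tiny Kotlin sketch of the two orientations (Video and its frames are made up purely for illustration):

    // Action->Object: agency separated from data – play() acts on the video
    data class VideoData(val title: String, val frames: List<String>)

    fun play(video: VideoData) {
        video.frames.forEach { println("${video.title}: $it") }
    }

    // Object->Action: the video encapsulates its frames and plays itself
    class Video(private val title: String, private val frames: List<String>) {
        fun play() = frames.forEach { println("$title: $it") }
    }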

In OOP, the cat kicks its own arse.

You’re absolutely correct if you’re thinking “That isn’t how we’d say or write it in real life”. It isn’t. I suspect this is one of the reasons some programmers find OOP counter-intuitive – it goes against the way we see the world.

Ironically, Object thinking – while not intuitive in that sense – makes discovery of actions much easier. What can I do with a video? What can I do with bread? And so forth. That’s why Object->Action still dominates UI design. Well, good UI design, anyway. Likewise, developers tend to find it easier to discover functions that can be applied to types when they start with the type.

When I wrote code to tell the story of what happens when a member donates a video to a community library, each line started with a function – well, in Java, a static method, which is effectively the same thing. This is not great OOP. Indeed, it’s not OOP. It’s FP.

And that’s fine. Functional Programming works more the way we say things in the real world. Clean the dishes. Set the timer. Kick the cat. I suspect this is one reason why more and more programmers are drawn to the functional paradigm – it works more the way we think, and reads more the way we talk. Or, at least, it can if we’re naming things well.

(There’s a separate discussion about encapsulation in FP. The tendency is for functional programmers not to bother with it, which leads to inevitable coupling problems. That’s not because you can’t encapsulate data in FP. It’s just that, as a concept, it hasn’t been paid much attention.)

If you’re doing OOP – and I still do much of the time, because it’s perfectly workable, thank you very much – then it goes Object->Action. Methods like play(video) and kick(cat) hint at responsibilities being in the wrong place, leading to the lack of encapsulation I witness in so much OO code.

It’s like they say: give a C programmer C++, and they’ll write you a C program with it.

 

 

 

Do Your Unit Tests Know *Too* Much?

A common pitfall of extensive unit testing reported by many teams is that, as the number of tests builds up, changing the implementation under test forces them to rewrite many, many tests. In this scenario, the test code becomes a barrier to change instead of its enabler.

Having witnessed this quite a few times first-hand, I’ve got a few observations I want to share.

First of all, it’s usually unmanaged dependencies in our implementation code that cause changes to ripple out to large numbers of tests.

Imagine we wrote a unit test for every public method or function in our implementation, and we decide to change the way one of them works. If that breaks a hundred of our unit tests, my first thought might be “Why is this method/function referenced in a hundred tests?” Did we write a hundred distinct tests for that one thing, or is it used in the set-up of a hundred tests? That would imply that there’s no real separation of concerns in that part of the implementation. A lack of modularity creates all kinds of problems that show up in test code – usually in the set-ups.

The second thing I wanted to mention is duplication in test code. There’s a dangerous idea that’s been gaining in popularity in recent years that we shouldn’t refactor any duplication out of our tests. The thinking is that it can make our tests harder to understand, and there’s some merit to this when it’s done with little thought.

But there are ways to compose tests, and to reuse set-ups, assertions and whole tests, that clearly communicate what’s going on (many well described in the xUnit Test Patterns book). Inexperienced programmers often struggle with code that’s composed of small, simple parts, and when they do, it almost always comes down to poor naming.

Composition – like separation of concerns – needs to be a black box affair. If you have to look inside the box to understand what’s going on, then you have a problem. I’m as guilty of sloppy naming as anyone, and that’s something I’m working to improve.

There’s one mechanism in particular for removing duplication from test code that I’ve been championing for 20 years – parameterised tests. When we have multiple examples of the same rule or behaviour covered by multiple tests, it’s a quick win to consolidate those examples into a single data-driven test that exercises all of those cases. This can help us in several ways:

  • Removes duplication
  • Offers an opportunity to document the rule instead of the examples (e.g., fourthFibonacciNumberIsTwo(), sixthFibonacciNumberIsFive() can become fibonacciNumberIsSumOfPreviousTwo() )
  • Opens a door to much more exhaustive testing with surprisingly little extra code

Maybe those 100 tests could be a handful of parameterised tests?
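For illustration, here’s what that consolidation might look like in Kotlin with JUnit 5, using the Fibonacci example from the list above (the fibonacci() function here is hypothetical, with 1-based indexing):

    import org.junit.jupiter.api.Assertions.assertEquals
    import org.junit.jupiter.params.ParameterizedTest
    import org.junit.jupiter.params.provider.CsvSource

    // 1-based indexing: fibonacci(1) = 0, fibonacci(4) = 2, fibonacci(6) = 5
    fun fibonacci(n: Int): Int =
        if (n <= 2) n - 1 else fibonacci(n - 1) + fibonacci(n - 2)

    class FibonacciTests {
        // One test documents the rule; the rows are just examples of it
        @ParameterizedTest
        @CsvSource("4, 2", "6, 5", "8, 13", "10, 34")
        fun `fibonacci number is sum of previous two`(index: Int, expected: Int) {
            assertEquals(expected, fibonacci(index))
        }
    }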

The fourth thing I wanted to talk about is over-reliance on mock objects. Mocks can be a great tool for achieving cohesive, loosely-coupled modules in a Tell, Don’t Ask style of design – I’m under the impression that’s why they were originally invented. But as they give with one hand, they can take away with the other. The irony with mocks is that, while they can lead us to better encapsulation, they do so by exposing the internal interactions of our modules. A little mocking can be powerful design fuel, but pour too much of it on your tests and you’ll burn the house down.

And the final thing I wanted to highlight is the granularity of our tests. Do we write a test for every method of every class? Do we have a corresponding test fixture for every module in the implementation? My experience has been that it’s neither necessary nor desirable to have test code that sticks to your internal design like cheese melted into a radio.

At the other extreme, many teams have written tests that do everything from the outside – e.g., from a public API, or at the controller or service level. I call these “keyhole tests”, because when I’ve worked with them it can feel a little like keyhole surgery. My experience with this style of testing is that, for sure, it can decouple your test code from internal complexity, but at a sometimes heavy price when tests fail and we end up in the debugger trying to figure out where in the complex internal – and non-visible – call stack things went wrong. It’s like when the engineer from the gas company detects a leak somewhere in your house by checking for a drop in pressure at the meter outside. Pinpointing the source of the leak may involve ripping up the floors…

The truth, for me, lies somewhere in between melted cheese and keyhole surgery. What I strive for within a body of code is – how can I put this? – internal APIs. These are interfaces within the implementation that encapsulate the inner complexity of a particular behaviour (e.g., the cluster of classes used to calculate mortgage repayments). They decouple that little cluster from the rest of the implementation, just as their dependencies are decoupled from them in the same S.O.L.I.D. way. And their interfaces tend to be stable, because they’re not about the details. Tests written around those interfaces are less likely to need to change, but also more targeted to a part of the design instead of the whole call stack. So when they fail, it’s easier to pinpoint where things went wrong. (Going back to the gas leak example, imagine having multiple test points throughout the house, so we can at least determine what room the leak is in.)
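Using the mortgage example, such an internal API might be as simple as this Kotlin sketch (names and formula are illustrative):

    import kotlin.math.pow

    // The stable seam: tests and the rest of the implementation depend on
    // this interface, not on the cluster of classes behind it
    interface MortgageRepayments {
        fun monthlyRepayment(principal: Double, annualRate: Double, years: Int): Double
    }

    // Whatever lives behind the seam can be refactored freely without
    // breaking tests written against the interface
    class AnnuityMortgageRepayments : MortgageRepayments {
        override fun monthlyRepayment(principal: Double, annualRate: Double, years: Int): Double {
            val monthlyRate = annualRate / 12
            val months = years * 12
            return principal * monthlyRate / (1 - (1 + monthlyRate).pow(-months))
        }
    }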

Even in a monolith, I aim to think of good internal architecture as a network of microservices, each with its own tests. By far the biggest cause of brittle tests I’ve seen is code that isn’t like that – it’s a Big Ball of Mud. That can be exacerbated by leaving all the duplication in the test code, by over-reliance on mock objects, and by trying to write tests for every method or function of every class.

You want tests that run fast and pinpoint failures, but that also leave enough wiggle room to easily refactor what’s happening behind those internal APIs.

Old Dogs, Old Tricks

Given what I do for a living, I tend to get pretty much daily practice at the fundamentals of code craft like TDD and refactoring. But even if I didn’t, after 27 years of professional programming I would still make time – every day, or at least once a week – to mindfully practice this stuff.

Why? Because – from experience – I believe these aren’t skills you learn once and then retain indefinitely. On occasions when I’ve been away from code for an extended period, I’ve come back to find myself doing sloppy work. I’d forget to do things. I’d fallen out of certain habits.

And I also don’t believe these are binary skills. There’s unit testing, and then there’s unit testing. Not all unit tests are created equal. Some are easier to understand than others. Some are less coupled to the implementation than others. Some run faster than others. Some cover more examples with less code than others. Some teams say “We do unit testing” and their code is still riddled with bugs. Some say “We do unit testing” and their code hasn’t had a bug reported since it went live.

No matter how good we think we are at writing tests or making design decisions or refactoring our code and so on, there’s almost always room for improvement. And as we gain experience in these skills, improving takes more and more work.

As a guitar player of some 30 years, I know that the hour or two of practice I get maybe twice a week these days barely keeps me at the level of technical ability I reached 20 years ago. Most of the visible progress I made happened in the first 5 years of playing. But there’s no top to that hill. If I stop practicing for a while – which I have for the last year or so for various reasons – the ball rolls back down. It takes some ongoing effort just to keep it where it is.

It feels the same with code craft: so many managers who came from programming backgrounds tell me that they lost their skills after a few years away from the code. Indeed, many put themselves on my courses just to try and keep their hand in. Use it or lose it.

An hour or two of mindful practice keeps me where I am as a programmer. It takes an hour or two more to move the ball uphill a bit. This is why I’m such a strong advocate of 20% time for developers. At the very least, half a day a week needs to be set aside to learn, to practice, and to try new things. Or an hour a day. Or however you want to slice it.

Putting aside no time – and I’ve seen this so many times – doesn’t just lead to dev teams who don’t improve, it leads to dev teams who progressively get worse. The code craft muscles atrophy, habits get lost, teams get sloppy, code gets crappy, and delivery slows down to a crawl.

If you prefer a mechanical analogy, think of development as a steamboat. Your goal is to deliver goods from London to New York, sailing every two weeks back and forth across the Atlantic. How much time do you set aside to maintain the engine? None? That’s just not sustainable. How many trips could you expect to make before the engine failed?

The developers on your team are the delivery engine of your software and systems. Their skills – not just technical – need regular maintenance, or the engine breaks down. Set aside some time and put aside some budget to help keep the team in good delivery shape. Not just the occasional training day once or twice a year; a regular investment in building and maintaining their skills. Once a week, or an hour a day – whatever works in your organisation.

(You may be thinking “Hey, we pay them to work. They can learn in their own time.” And that might explain why your efforts to increase diversity on your teams haven’t worked.)

My closing message here is that it’s not just junior programmers who need to work on the fundamentals. We all do.

Here’s a tip from an old dog: a great way to reinforce knowledge and skills is to teach them. My advice for managers is to combine regular practice with mentoring. Your more experienced developers mentoring new hires will find – as I have – that the mentoring process forces us to really think about not just what we’re doing, but why we’re doing it. And it encourages us to be perhaps that little bit more exacting about how we do it, because we’re trying to set a good example. It’s us as programmers, but in our Sunday best.

Just 3 1-day Client Site Workshops Left

A quick plug for the remaining 1-day client site workshops I’m running to celebrate 10 years of Codemanship.

These code craft training workshops are for groups of up to 20 developers, and can be delivered anywhere in the UK for the never-to-be-repeated price of £1,995. That’s as little as £99.75 per person.

Dates remaining are Thurs Aug 15th, Tues Aug 27th and Thurs Aug 29th. You can choose from an action-packed TDD workshop, refactoring workshop or unit testing workshop. All hands-on and jammed full of practical and useful ideas.

Booking’s easy: whip out that company credit card and visit our Eventbrite page.

Codemanship Exercise “Play-Through” – CD Warehouse Part #4

In yesterday’s post, I implemented the feature to allow record labels to send batches of CDs in to the warehouse. Our design is now starting to grow to the point where compromises have to be made – meaning that, at times, we have to choose between, say, making sure classes have a single responsibility and making sure they fully encapsulate their data.

It’s very important to make our code as clean as we can, but it’s also important to remember that perfection on anything more complex than the most trivial problems isn’t possible. We might aim for perfection, but we have to accept we’ll never reach it.

This is a nuanced message. Many developers see it as justification for not aiming for perfection – indeed, for not even trying to keep their code clean in some cases. Other developers, myself included, hear “Aim for the stars and you’ll reach the Moon”.

Perfectionism has been given a bad rap in software development. Managers in particular often have an irrational fear of developers “taking quality too far”. Anyone who’s dedicated a few years to looking at people’s code will know that nobody out there is producing software that’s too good.

Keeping code clean is hard. And the more the code grows and ages, the harder it gets – just as staying in shape in your forties is harder than in your thirties. Developers usually greatly underestimate how much attention and effort needs to go into keeping code easily maintainable, and teams typically under-invest in it.

Having said all that, it is possible to over-invest: to devote too much thought and effort to studying and refactoring our designs. Like in life: we need to invest in our health and fitness if we want to live a long and full existence. But we also need to get on with life – living, loving, reading, traveling, dancing, painting, snowboarding, and whatever else is the thing that gets us up in the morning.

Software’s a bit like that. We need to invest in giving it a long useful life. But we also need to get shit done. Finding that balance continues to be a subject of great interest (and frustration) to me. How much refactoring is too much? And how little refactoring is too little? Ultimately, what matters is the end result – how much value did we deliver, and for how long?

Anyway, enough navel gazing. Let’s pick up where we left off yesterday and get this shit done.

We’re nearly at the end of our list of tests for the CD warehouse.

  • Buy a CD
    • Payment accepted
    • Payment rejected
    • Out of stock
  • Search for a CD (by title and artist)
    • Warehouse has matching title
    • No match found
  • Receive batch of CDs
    • Of existing title in catalogue
    • Of new title (add to catalogue)
    • Of multiple titles
  • Review a CD
    • Just a rating
    • Rating and review

Just two more tests to pass. Okay. Let’s move on to reviewing a CD. There are a couple of happy paths in our test list:

  • Review a CD
    • Just a rating
    • Rating and review

The simplest case is when they just leave a rating. Let’s write a failing test for that.

To pass this test, we simply push the new review on to the CD’s reviews list.
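(The code in this series is JavaScript – linked at the end – but sketched in Kotlin, with illustrative names, the move is roughly this:)

    class Review(val rating: Int)

    class CD(val title: String, val artist: String) {
        val reviews = mutableListOf<Review>()

        // Simplest thing that passes: push a new review onto the list
        fun review(rating: Int) {
            reviews.add(Review(rating))
        }
    }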

Now let’s add text to our review in our next failing test.

This is easy to pass.

So that’s us done, then. Isn’t it? We’ve crossed off the last two tests on our list.

But there may be one or two little wrinkles we need to think about. Firstly, the requirements said that customers could leave reviews for CDs they’ve bought. What if we try to leave a review for a CD we didn’t buy?

We currently don’t record who bought a CD, and our model has no concept of customer. So there’s a couple more tests we need to add to our list. First, when a CD is bought, we need to make a record of it for the customer who bought it.

Passing this test is simples.

We’re on a green light again. Let’s review the code. I’m not happy about CD accessing the customer’s CD collection directly (or the test, for that matter). Let’s extract some methods and move them to Customer to encapsulate this.
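Sketched in Kotlin again (the real code is JS), the extracted-and-moved methods might look like:

    class Customer {
        private val collection = mutableListOf<CD>()

        // Moved here so only Customer touches its own collection
        fun add(cd: CD) { collection.add(cd) }
        fun has(cd: CD): Boolean = cd in collection
    }

    class CD(val title: String) {
        fun buy(customer: Customer) {
            customer.add(this) // record the purchase for the customer
        }
    }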

Now that we know what CDs a customer has bought from us, we can add an edge case for when a customer tries to leave a review for a CD that’s not in their collection.

And to pass this, we just need to check if the customer has that CD.

Now we’re back on a green light, let’s review this new code. I’m not happy about the message chain I’ve left: review.customer.has(this). Let’s encapsulate that in the Review class. (This refactoring is called Hide Delegate).
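Roughly, in Kotlin terms (building on the sketches above):

    // Before: CD reached through the review to the customer – the message
    // chain review.customer.has(this)

    // After Hide Delegate: Review hides its customer behind a method of its own
    class Review(private val customer: Customer, val rating: Int) {
        fun isByOwnerOf(cd: CD): Boolean = customer.has(cd)
    }

    // ...so CD now asks the review directly: review.isByOwnerOf(this)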

Okay. We’re nearly there. One last question for our real customer: what are these ratings for? As it stands, a CD can have a list of reviews with ratings. But do we expect our users to go through the list adding them up to calculate an average rating? I think the software should do that. Let’s add one more failing test.

To pass this test, we just sum the ratings and divide by their number.

Okay. Back on a green light. Let’s do a mini code review. Well, I’m not convinced my code for calculating the average rating’s entirely obvious. Let’s try extracting a function for calculating the sum.
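In Kotlin terms, the rule and the extracted sum might read:

    class CD(val title: String) {
        val reviews = mutableListOf<Review>()

        // Average rating: sum of the ratings divided by their number
        fun averageRating(): Double = sumOfRatings() / reviews.size

        // Extracted so the calculation reads as intended
        private fun sumOfRatings(): Double = reviews.sumOf { it.rating }.toDouble()
    }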

Next on our checklist is duplication. There is some duplicated set-up in the tests, but only a small amount, and we’re on the edge of the Rule of Three here: I could invest time in extracting set-up methods, but would that investment pay off considering that we’re actually done? I’m going to leave it in there, favouring readability.

All the methods and classes in our code are pretty simple – no loops inside loops inside IFs or anything like that.

But I do now think – yet again – that CD has more than one job. Let’s extract a class for handling reviews.

In my design now, CD does one job, and simply acts as a facade to the other responsibilities, shielding any clients from knowledge of the little subsystem it’s now composed of.
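As a final Kotlin sketch, that extraction might look like:

    // Reviews now has the single job of managing a CD's reviews
    class Reviews {
        private val reviews = mutableListOf<Review>()

        fun add(review: Review) { reviews.add(review) }
        fun averageRating(): Double = reviews.sumOf { it.rating }.toDouble() / reviews.size
    }

    // CD delegates, acting as a facade over its little subsystem
    class CD(val title: String) {
        private val reviews = Reviews()

        fun review(review: Review) = reviews.add(review)
        fun averageRating(): Double = reviews.averageRating()
    }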

Now we should check through the methods in our classes for any examples of Feature Envy we may have added. Nothing to report there. (Remember, there are examples of Feature Envy that we knowingly introduced as side-effects of splitting apart cohesive responsibilities so that classes do just one job.)

Next, let’s check that the dependencies in our classes are swappable by dependency injection. The quickest way to do this in JavaScript is to look for imports. Aside from a factory class, none of the other classes in our model have direct imports. All of their dependencies are given to them from the outside. And we can see that all the imports are at the top of the call stack, in the tests.

Good modular designs wear their dependencies on the outside.

Finally, let’s check that none of our modules are referencing things they don’t use. In JavaScript, this again means checking the imports.

In this test module, WebStorm tells me that OutOfStockError is unused. Let’s organise our imports and get rid of it.

And that, folks, is my play-through of the CD warehouse exercise from the Codemanship TDD course. No doubt, you’ll have noticed some things while we did it. For example, in some places I chose a defensive style of programming, raising errors if pre-conditions for buying a CD and for leaving a review weren’t met. In other places, I assumed that the client code (e.g., the UI) wouldn’t invoke methods on the model if the precondition was broken (e.g., asking for the average of zero CD ratings).

These choices were based on which edge cases had been discussed with the customer and which hadn’t. If we agreed that only people who’ve bought a CD can review it, there are multiple ways we can enforce that in the code. The way I prefer is to not give the end user the choice if it’s not valid. This can be done by good UI design, for example. Maybe the user is only presented with a “Review” button for CDs that are in their collection.

Anyhoo, that’s a discussion for another time.

 

If you’d like to see the complete JS code for my CD warehouse, it’s here.