A Programmer’s “Breadboard”?

I’m a fan of rapid feedback. When I write software, I prefer to find out if I’m on the right track as soon as possible.

There are all sorts of techniques I’ve tried for getting customer feedback without having to go to the effort of delivering production-quality software, which is a time-consuming and expensive way of getting feedback.

I’ve tried UI wire frames and storyboards, which do tend to get the message across, but suffer from two major drawbacks: one is that we can’t run them and see if they work on a real problem, and the other is that we have to commit to a UI design early in order to explore the logic of our software. Once a design’s been “made flesh”, it tends to stick, even if much better designs are possible.

I’ve tried exploring with test cases – examples, basically – but have found that often too abstract for customers to get a real sense of the design.

The thing I’ve tried that gets the most valuable feedback really is to put working software in front of end users and let them try it for themselves. But, like I said, working software is expensive to create.

In the 1990s, when Rapid Development was at its peak, tools appeared that allowed us to “slap together” working prototypes quickly. They typically had names with “Visual” or “Builder” in them, and they worked by dragging and dropping GUI components onto forms or windows or panels, and we could bind those controls to databases and add a little code or a simple macro to glue it all together into something that kind of sort of does what we need.

Then we would take our basic prototype to the customer (sometimes we actually created it in front of the customer), and let them take it for a spin. In fairly rapid iterations, we’d refine the prototype based on the customer’s feedback until we got it in the ballpark.

Then – and this is where it all went wrong – we’d say “Okay, great. Now can we have £500,000 to build it properly in C++?” And they’d say “No, thanks. I’m good with this version.” And then we’d ship the prototype that was made of twigs and string (and Visual Basic for Applications) and live with the very expensive consequences of allowing a business to rely on software that isn’t production quality. (Go to any bank and ask them how many Excel spreadsheets they rely on for enterprise, mission-critical applications. It’ll boggle your mind.)

The pioneers of Extreme Programming learned from this that we should never put working software in front of our customers that isn’t production-ready, because if they like it they’ll make us ship it.

Sketches don’t suffer this drawback, because we very obviously can’t ship a drawing on a whiteboard or in Visio. (Although we did try briefly in the late 90s and early 00s.)

Now, in electronics, it’s possible to create a working prototype that is very obviously not a finished product and that no customer would tell you to ship. Here’s guitar pedal designer Dean Wampler showing us his breadboard that he uses to explore pedal designs.

This is the thing with a breadboard guitar pedal: you can plug a guitar into one end and plug the output into a real guitar amp and hear how the pedal will sound.

It looks nothing like a finished pedal you would buy in a shop, and you certainly couldn’t take it on the road. A production-quality Wampler pedal is much smaller, much more robust and much more user-friendly.

[Image: a Wampler pedal]

Of course, these days, it’s entirely possible to design a pedal on a computer and simulate how it will sound. Some guitar amp manufacturers design their amps that way. But you still want players to be able to plug in their guitar and see how it sounds (and how it feels to play through it, which is tricky with software simulations because of latency.)

So breadboards in guitar electronics design persist. There’s no substitute for the real thing (until virtual breadboards catch up).

And this all got me to thinking: what do we have that’s like a breadboard?

Pictures won’t cut it because users can’t run a picture and play around with it. They have to use their imaginations to interpret how the software will respond to their actions. It’s like showing someone a set of production design sketches and asking them “So, how did you like the movie?”

High-fidelity prototypes won’t cut it because customers make us ship them, and the business landscape is already drowning in legacy systems from the 90s that were only intended to be for illustration purposes.

So I’m wondering: is there a way of throwing together a working app quickly for customer feedback – Microsoft Access-style – that doesn’t bind us to a UI design too early, and that very obviously can’t be shipped? And, very usefully, one that could be evolved into a production-quality version without necessarily having to rewrite the whole thing from scratch?

Right now, nothing springs to mind.

(Talking of Scratch…)

The Value’s Not In Features, It’s In Learning

A small poll I ran on the Codemanship Twitter account seems to confirm what I’ve observed and heard in the field about “agile” development teams still being largely plan-driven.

If you’re genuinely feedback-driven, then your product backlog won’t survive much further out than the next cycle of customer feedback. Maintaining backlogs that look months ahead is a sign that just maybe you’re incrementally working through a feature list instead of iteratively solving a set of business problems.

And this cuts to the core of a major, fundamental malaise in contemporary Agile Software Development. Teams are failing to grasp that the “value” that “flows” in software development is in what we learn with each iteration, not in the features themselves.

Perhaps a better name for “features” might be “guesses” – we’re guessing what might be needed to solve a problem. We won’t know until we’ve tried, though. So each release is a vital opportunity to test our assumptions and feed back what we learn into the next release.

I see teams vigorously defending their product backlogs from significant change, and energetically avoiding feedback that might reveal that we got it wrong this time. Folk have invested a lot in the creation of their backlog – often envisioning a whole product in significant detail – and can take it pretty personally when the end users say “Nope, this isn’t working”.

With a first release – when our code meets the real world for the first time – I expect a lot of change to a product vision. With learning and subsequent iterations of the design, the product vision will usually stabilise. But when we track how much the backlog changes with each release on most teams, we see mostly tweaking. Initial product visions – which, let’s be clear, are just educated guesses at best – tend to remain largely intact. Once folk are invested in a solution, they struggle to let go of it.

Teams with a strong product vision often suffer from confirmation bias when considering end user feedback. (Remember: the “customer” on many products is just as invested in the product vision if they’ve actively participated in its creation.) Feedback that supports their thesis tends to be promoted. Feedback that contradicts their guesswork gets demoted. It’s just human nature, but its skewing effect on the design process usually gets overlooked.

The best way to avoid becoming wedded to a detailed product vision or plan is not to have a detailed product vision or plan. Assume as little as possible to make something simple we can learn from in the next go-round. Focus on achieving long-term goals, not on delivering detailed plans.

In simpler terms: ditch the backlogs.

Codemanship’s Code Craft Road Map

One of the goals behind my training courses is to help developers navigate all the various disciplines of what we these days call code craft.

It helps me to have a mental road map of these disciplines, refined from three decades of developing software professionally.

[Image: the Codemanship code craft road map]

When I posted this on Twitter, a couple of people got in touch to say that they find it helpful, but also that a few of the disciplines were unfamiliar to them. So I thought it might be useful to go through them and summarise what they mean.

  • Foundations – the core enabling practices of code craft
    • Unit Testing – is writing fast-running automated tests to check the logic of our code, that we can run many times a day to ensure any changes we’ve made haven’t broken the software. We currently know of no other practical way of achieving this. Slow tests cause major bottlenecks in the development process, and tend to produce less reliable code that’s more expensive to maintain. Some folk say “unit testing” to mean “tests that check a single function, or a single module”. I mean “tests that have no external dependencies (e.g., a database) and run very fast”.
    • Version Control – is seat belts for programmers. The ability to go back to a previous working version of the code provides essential safety and frees us to be bolder with our code experiments. Version Control Systems these days also enable more effective collaboration between developers working on the same code base. I still occasionally see teams editing live code together, or even emailing source files to each other. That, my friends, is the hard way.
    • Evolutionary Development – is what fast-running unit tests and version control enable. It is one or more programmers and their customers collectively solving problems together through a series of rapid releases of a working solution, getting it less wrong with each pass based on real-world feedback. It is not teams incrementally munching their way through a feature list or any other kind of detailed plan. It’s all about the feedback, which is where we learn what works and what doesn’t. There are many takes on evolutionary development. Mine starts with a testable business goal, and ends with that goal being achieved. Yours should, too. Every release is an experiment, and experiments can fail. So the ability to revert to a previous version of the code is essential. Fast-running unit tests help keep changes to code safe and affordable. If we can’t change the code easily, evolution stalls. All of the practices of code craft are designed to enable rapid and sustained evolution of working software. In short, code craft means more throws of the dice.
  • Team Craft – how developers work together to deliver software
    • Pair Programming – is two programmers working side-by-side (figuratively speaking, because sometimes they might not even be on the same continent), writing code in real time as a single unit. One types the code – the “driver” – and one provides high-level directions – the “navigator”. When we’re driving, it’s easy to miss the bigger picture. Just like on a car journey, in the days before GPS navigation. The person at the wheel needs to be concentrating on the road, so a passenger reads the map and tells them where to go. The navigator also keeps an eye out for hazards the driver may have missed. In programming terms, that could be code quality problems, missing tests, and so on – things that could make the code harder to change later. In that sense, the navigator in a programming pair acts as a kind of quality gate, catching problems the driver may not have noticed. Studies show that pair programming produces better quality code, when it’s done effectively. It’s also a great way to share knowledge within a team. One pairing partner may know, for example, useful shortcuts in their editor that the other doesn’t. If members of a team pair with each other regularly, soon enough they’ll all know those shortcuts. Teams that pair tend to learn faster. That’s why pairing is an essential component of Codemanship training and coaching. But I appreciate that many teams view pairing as “two programmers doing the work of one”, and pair programming can be a tough sell to management. I see it a different way: for me, pair programming is two programmers avoiding the rework of seven.
    • Mob Programming – sometimes, especially in the early stages of development, we need to get the whole team on the same page. I’ve been using mob programming – where the team, or a section of it, all work together in real-time on the same code (typically around a big TV or projector screen) – for nearly 20 years. I’m a fan of how it can bring forward all those discussions and disagreements about design, about the team’s approach, and about the problem domain, airing all those issues early in the process. More recently, I’ve been encouraging teams to mob instead of having team meetings. There’s only so much we can iron out sitting around a table talking. Eventually, I like to see the code. It’s striking how often debates and misunderstandings evaporate when we actually look at the real code and try our ideas for real as a group. For me, the essence of mob programming is: don’t tell me, show me. And with more brains in the room, we greatly increase the odds that someone knows the answer. It’s telling that when we do team exercises on Codemanship workshops, the teams that mob tend to complete the exercises faster than the teams who work in parallel. And, like pair programming, mobbing accelerates team learning. If you have junior or trainee developers on your team, I seriously recommend regular mobbing as well as pairing.
  • Specification By Example – is using concrete examples to drive out a precise understanding of what the customer needs the software to do. It is usually practised at two levels of abstraction: the system, and the internal high-level design of the code.
    • Test-Driven Development – is using tests (typically internal unit tests) to evolve the internal design of a system that satisfies an external (“customer”) test. It mandates discovery of internal design in small and very frequent feedback loops, making a few design decisions in each feedback loop. In each feedback loop, we start by writing a test that fails, which describes something we need the code to do that it currently doesn’t. Then we write the simplest solution that will pass that test. Then we review the code and make any necessary improvements – e.g. to remove some duplication, or make the code easier to understand – before moving on to the next failing test. One test at a time, we flesh out a design, discovering the internal logic and useful abstractions like methods/functions, classes/modules, interfaces and so on as we triangulate a working solution. TDD has multiple benefits that tend to make the investment in our tests worthwhile. For a start, if we only write code to pass tests, then at the end we will have all our solution code covered by fast-running tests. TDD produces high test assurance. Also, we’ve found that code that is test-driven tends to be simpler, lower in duplication and more modular. Indeed, TDD forces us to design our solutions in such a way that they are testable. Testable is synonymous with modular. Working in fast feedback loops means we tend to make fewer design decisions before getting feedback, and this tends to bring more focus to each decision. TDD, done well, promotes a form of continuous code review that few other techniques do. TDD also discourages us from writing code we don’t need, since all solution code is written to pass tests. It focuses us on the “what” instead of the “how”. Overly complex or redundant code is reduced. So, TDD tends to produce more reliable code (studies find up to 90% fewer bugs in production), that can be re-tested quickly, and that is simpler and more maintainable. It’s an effective way to achieve the frequent and sustained release cycles demanded by evolutionary development. We’ve yet to find a better way.
    • Behaviour-Driven Development – is working with the customer at the system level to precisely define not what the functions and modules inside do, but what the system does as a whole. Customer tests – tests we’ve agreed with our customer that describe system behaviour using real examples (e.g., for a £250,000 mortgage paid back over 25 years at 4% interest, the monthly payments should be exactly £1,290) – drive our internal design, telling us what the units in our “unit tests” need to do in order to deliver the system behaviour the customer desires. These tests say nothing about how the required outputs are calculated, and ideally make no mention of the system design itself, leaving the developers and UX folk to figure those design details out. They are purely logical tests, precisely capturing the domain logic involved in interactions with the system. The power of BDD and customer tests (sometimes called “acceptance tests”) is how using concrete examples can help us drive out a shared understanding of what exactly a requirement like “…and then the mortgage repayments are calculated” really means. Automating these tests to pull in the example data provided by our customer forces us to be 100% clear about what the test means, since a computer cannot interpret an ambiguous statement (yet). Customer tests provide an outer “wheel” that drives the inner wheel of unit tests and TDD. We may need to write a bunch of internal units to pass an external customer test, so that outer wheel will turn slower. But it’s important those wheels of BDD and TDD are directly connected. We only write solution code to pass unit tests, and we only write unit tests for logic needed to pass the customer test.
  • Code Quality – refers specifically to the properties of our code that make it easier or harder to change. As teams mature, their focus will often shift away from “making it work” to “making it easier to change, too”. This typically signals a growth in the maturity of the developers as code crafters.
    • Software Design Principles – address the underlying factors in code mechanics that can make code harder to change. On Codemanship courses, we teach two sets of design principles: Simple Design and Modular Design.
      • Simple Design
        • The code must work
        • The code must clearly reveal its intent (i.e., using module names, function names, variable names, constants and so on, to tell the story of what the code does)
        • The code must be low in duplication (unless that makes it harder to understand)
        • The code must be the simplest thing that will work
      • Modular Design (where a “module” could be a class, or component, or a service etc)
        • Modules should do one job
        • Modules should know as little about each other as possible
        • Module dependencies should be easy to swap
    • Refactoring – is the discipline of improving the internal design of our software without changing what it does. More bluntly, it’s making the code easier to change without breaking it. Like TDD, refactoring works in small feedback cycles. We perform a single refactoring – like renaming a class – and then we immediately re-run our tests to make sure we didn’t break anything. Then we do another refactoring (e.g., move that class into a different package) and test again. And then another refactoring, and test. And another, and test. And so on. As you can probably imagine, a good suite of fast-running automated tests is essential here. Refactoring and TDD work hand-in-hand: the tests make refactoring safer, and without a significant amount of refactoring, TDD becomes unsustainable. Working in these small, safe steps, a good developer can quite radically restructure the code whilst ensuring all along the way that the software still works. I was very tempted to put refactoring under Foundation, because it really is a foundational discipline for any kind of programming. But it requires a good “nose” for code quality, and it’s also an advanced skill to learn properly. So I’ve grouped it here under Code Quality. Developers need to learn to recognise code quality problems when they see them, and get hundreds of hours of practice at refactoring the code safely to eliminate them.
    • Legacy Code – is code that is in active use, and therefore probably needs to be updated and improved regularly, but is too expensive and risky to change. This is usually because the code lacks fast-running automated tests. To change legacy code safely, we need to get unit tests around the parts of the code we need to change. To achieve that, we usually need to refactor that code to make it easy to unit test – i.e., to remove external dependencies from that code. This takes discipline and care. But if every change to a legacy system started with these steps, over time the unit test coverage would rise and the internal design would become more and more modular, making changes progressively easier. Most developers are afraid to work on legacy code. But with a little extra discipline, they needn’t be. I actually find it very satisfying to rehabilitate software that’s become a millstone around our customers’ necks. Most code in operation today is legacy code.
    • Continuous Inspection – is how we catch code quality problems early, when they’re easier to fix. Like anything with the word “continuous” in the title, continuous inspection implies frequent automated checking of the code for code quality “bugs” like functions that are too big or too complicated, modules with too many dependencies and so on. In traditional approaches, teams do code reviews to find these kinds of issues. For example, it’s popular these days to require a code review before a developer’s changes can be merged into the master branch of their repo. This creates bottlenecks in the delivery process, though. Code reviews performed by people looking at the code are a form of manual testing. You have to wait for someone to be available to do it, and it may take them some time to review all the changes you’ve made. More advanced teams have removed this bottleneck by automating some or all of their code reviews. It requires some investment to create an effective suite of code quality gates, but the pay-off in speeding up the check-in process usually more than pays for it. Teams doing continuous inspection tend to produce code of a significantly higher quality than teams doing manual code reviews.
  • Software Delivery – is all about how the code we write gets to the operational environment that requires it. We typically cover it in two stages: how does code get from the developer’s desktop into a shared repository of code that could be built, tested and released at any time? And how does that code get from the repository onto the end user’s smartphone, or the rented cloud servers, or the TV set-top box as a complete usable product?
    • Continuous Integration – is the practice of developers frequently (at least once a day) merging their changes into a shared repository from which the software can be built, tested and potentially deployed. Often seen as purely a technology issue – “we have a build server” – CI is actually a set of disciplines that the technology only enables if the team applies them. First, it implies that developers don’t go too long before merging their changes into the same branch – usually the master branch or “trunk”. Long-lived developer branches – often referred to as “feature branches” – that go unmerged for days prevent frequent merging of (and testing of merged) code, and are therefore most definitely not CI. The benefit of frequent tested merges is that we catch conflicts much earlier, and more frequent merges typically means fewer changes in each merge, therefore fewer merge conflicts overall. Teams working on long-lived branches often report being stuck in “merge hell” where, say, at the end of the week everyone in the team tries to merge large batches of conflicting changes. In CI, once a developer has merged their changes to the master branch, the code in the repo is built and the tests are run to ensure none of those changes has “broken the build”. It also acts as a double-check that the changes work on a different machine (the build server), which reduces the risk of configuration mistakes. Another implication of CI – if our intent is to have a repository of code that can be deployed at any time – is that the code in the master branch must always work. This means that developers need to check before they merge that the resulting merged code will work. Running a suite of good automated tests beforehand helps to ensure this. Teams who lack those tests – or who don’t run them because they take too long – tend to find that the code in their repo is permanently broken to some degree. In this case, releases will require a “stabilisation” phase to find the bugs and fix them. So the software can’t be released as soon as the customer wants.
    • Continuous Delivery – means ensuring that our software is always shippable. This encompasses a lot of disciplines. If there is code sitting on developers’ desktops or languishing in long-lived branches, we can’t ship it. If the code sitting in our repo is broken, we can’t ship it. If there’s no fast and reliable way to take the code in the repo and deploy it as a working end product to where it needs to go, we can’t ship it. As well as disciplines like TDD and CI, continuous delivery also requires a very significant investment in automating the delivery pipeline – automating builds, automating testing (and making those tests run fast enough), automating code reviews, automating deployments, and so on. And these automated delivery processes need to be fast. If your builds take 3 hours – usually because the tests take so long to run – then that will slow down those all-important customer feedback loops, and slow down the process of learning from our releases and evolving a better design. Build times in particular are like the metabolism of your development process. If development has a slow metabolism, that can lead to all sorts of other problems. You’d be surprised how often I’ve seen teams with myriad difficulties watch those issues magically evaporate after we cut their build+test time down from hours to minutes.
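
As a rough illustration of two ideas from the list above – unit tests with no external dependencies, and module dependencies that are easy to swap – here is a minimal Java sketch. The names ExchangeRates, PriceConverter and PriceConverterCheck are hypothetical, invented purely for this example. In production, ExchangeRates might be backed by a database or a web service; in the test it is swapped for a lambda that returns a fixed rate, so the check runs in memory and very fast.

```java
// Hypothetical example: none of these names come from a real library.

/** The swappable dependency: something that can supply an exchange rate. */
interface ExchangeRates {
    double rate(String fromCurrency, String toCurrency);
}

/** The logic under test knows only the interface, not where rates come from. */
final class PriceConverter {
    private final ExchangeRates rates;

    PriceConverter(ExchangeRates rates) {
        this.rates = rates;
    }

    double convert(double amount, String from, String to) {
        return amount * rates.rate(from, to);
    }
}

/** A hand-rolled check standing in for a unit test in your framework of choice. */
final class PriceConverterCheck {

    static void convertsUsingTheSuppliedRate() {
        // Swap the real source of rates for a fixed, in-memory stand-in.
        ExchangeRates fixedRate = (from, to) -> 1.25;
        PriceConverter converter = new PriceConverter(fixedRate);

        double converted = converter.convert(100.0, "GBP", "USD");

        if (Math.abs(converted - 125.0) > 1e-9) {
            throw new AssertionError("expected 125.0 but got " + converted);
        }
    }

    public static void main(String[] args) {
        convertsUsingTheSuppliedRate();
        System.out.println("convertsUsingTheSuppliedRate passed");
    }
}
```

The same move – introducing an interface so that the slow or external part can be replaced – is one common way of getting a first unit test around legacy code.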

Now, most of this stuff is known to most developers – or, at the very least, they know of it. The final two headings caused a few scratched heads. These are more advanced topics that I’ve found teams do need to think about, but usually after they’ve mastered the core disciplines that come before.

  • Managing Code Craft
    • The Case for Code Craft – acknowledges that code craft doesn’t exist in a vacuum, and shouldn’t be seen as an end in itself. We don’t write unit tests because, for example, we’re “professionals”. We write unit tests to make changing code easier and safer. I’ve found that being clear in my own mind about why I’m doing these things helps enormously, both in doing them and in persuading teams that they should try them, too. I hear it from teams all the time: “We want to do TDD, but we’re not allowed”. I’ve never had that problem, and my ability to articulate why I’m doing TDD helps.
    • Code Craft Metrics – once you’ve made your case, you’ll need to back it up with hard data. Do the disciplines of code craft really speed up feedback cycles? Do they really reduce bug counts, and does that really save time and money? Do they really reduce the cost of changing code? Do they really help us to sustain the pace of innovation for longer? I’m amazed how few teams track these things. It’s very handy data to have when the boss comes a’knockin’ with their Micro-Manager hat on, ready to tell you how to do your job.
    • Scaling Code Craft – is all about how code craft on a team and within a development organisation just doesn’t magically happen overnight. There are lots of skills and ideas and tools involved, all of which need to be learned. And these are practical skills, like riding a bicycle. You can’t just read a book and go “Hey, I’m a test-driven developer now”. Nope. You’re just someone who knows in theory what TDD is. You’ve got to do TDD to learn TDD, and lots of it. And all that takes time. Most teams who fail to adopt code craft practices do so because they grossly underestimated how much time would be required to learn them. They approach it with such low “energy” that the code craft learning curve might as well be a wall. So I help organisations structure their learning, with a combination of reading, training and mentoring to get teams on the same page, and peer-based practice and learning. To scale that up, you need to be growing your own internal mentors. Ad hoc, “a bit here when it’s needed”, “a smidgen there when we get a moment” simply doesn’t seem to work. You need to have a plan, and you need to invest. And however much you were thinking of investing, it’s not going to be enough.
  • High-Integrity Code Craft
    • Load-Bearing Code – is that portion of code that we find in almost any non-trivial software that is much more critical than the rest. That might be because it’s on an execution path for a critical feature, or because it’s a heavily reused piece of code that lies on many paths for many features. Most teams are not aware of where their load-bearing code is. Most teams don’t give it any thought. And this is where many of the horror stories attributed to bugs in software begin. Teams can improve at identifying load-bearing code, and at applying more exhaustive and rigorous testing techniques to achieve higher levels of assurance when needed. And before you say “Yeah, but none of our code is critical”, I’ll bet a shiny penny there’s a small percentage of your code that really, really, really needs to work. It’s there, lurking in most software, just waiting to send that embarrassing email to everyone in your address book.
    • Guided Inspection – is a powerful way of testing code by reading it. Many studies have shown that code inspections tend to find more bugs than any other kind of testing. In guided inspections, we step through our code line by line, reasoning about what it will do for a specific test case – effectively executing the code in our heads. This is, of course, labour-intensive, but we would typically only do it for load-bearing code, and only when that code itself has changed. If we discover new bugs in an inspection, we feed that back into an automated test that will catch the bug if it ever re-emerges, adding it to our suite of fast-running regression tests.
    • Design By Contract – is a technique for ensuring the correctness of the interactions between components of our system. Every interaction has a contract: a pre-condition that describes when a function or service can be used (e.g., you can only transfer money if your account has sufficient funds), and a post-condition that describes what that function or service should provide to the client (e.g., the money is deducted from your account and credited to the payee’s account). There are also invariants: things that must always be true if the software is working as required (e.g., your account never goes over its limit). Contracts are useful in two ways: for reasoning about the correct behaviour of functions and services, and for embedding expectations about that behaviour inside the code itself as assertions that will fail during testing if an expectation isn’t satisfied. We can test post-conditions using traditional unit tests, but in load-bearing code, teams have found it helpful to assert pre-conditions to ensure that not only do functions and services do what they’re supposed to, but they’re only ever called when they should be. DBC presents us with some useful conceptual tools, as well as programming techniques when we need them. It also paves the way to a much more exhaustive kind of automated testing, namely…
    • Property-Based Testing – sometimes referred to as generative testing, is a form of automated testing where the inputs to the tests themselves are programmatically calculated. For example, we might test that a numerical algorithm works for a range of inputs from 0…1000, at increments of 0.01. Or we might test that a shipping calculation works for all combinations of inputs of country, weight class and mailing class. This is achieved by generalising the expected results in our tests, so instead of asserting that the square root of 4 is 2, we might assert that the square root of any positive number multiplied by itself is equal to the original number. These properties of correct test results look a lot like the contracts we might write when we practice Design By Contract, and therefore we might find experience in writing contracts helpful in building that kind of declarative style of asserting. The beauty of property-based tests is that they scale easily. Generating 1,000 random inputs and generating 10,000 random inputs requires a change of a single character in our test. One character, 9,000 extra test cases. Two additional characters (100,000) yields 99,000 more test cases. Property-based tests enable us to achieve quite mind-boggling levels of test assurance with relatively little extra test code, using tools most developers already know.
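
To make the last two disciplines a little more concrete, here is a minimal Java sketch of the money transfer example with its contract embedded as checks, plus a hand-rolled generative test that throws randomly generated transfers at it. Everything here is an assumption made for illustration: the Account and AccountProperties classes, the pence-based amounts and the ranges of the random inputs are all invented, and a real team might well reach for a property-based testing library rather than java.util.Random.

```java
import java.util.Random;

/** Hypothetical account with its contract embedded as runtime checks. */
final class Account {
    private long balancePence;
    private final long overdraftLimitPence;

    Account(long openingBalancePence, long overdraftLimitPence) {
        this.balancePence = openingBalancePence;
        this.overdraftLimitPence = overdraftLimitPence;
        checkInvariant();
    }

    long balance() {
        return balancePence;
    }

    /** Pre-condition: a positive amount that does not take us past the overdraft limit. */
    boolean canTransfer(long amountPence) {
        return amountPence > 0 && balancePence - amountPence >= -overdraftLimitPence;
    }

    void transferTo(Account payee, long amountPence) {
        if (payee == this || !canTransfer(amountPence)) {
            throw new IllegalStateException("pre-condition failed");
        }
        long payerBefore = balancePence;
        long payeeBefore = payee.balancePence;

        balancePence -= amountPence;
        payee.balancePence += amountPence;

        // Post-condition: the money left the payer and reached the payee.
        if (balancePence != payerBefore - amountPence
                || payee.balancePence != payeeBefore + amountPence) {
            throw new IllegalStateException("post-condition failed");
        }
        checkInvariant();
        payee.checkInvariant();
    }

    /** Invariant: the account never goes over its limit. */
    private void checkInvariant() {
        if (balancePence < -overdraftLimitPence) {
            throw new IllegalStateException("invariant failed");
        }
    }
}

/** Hand-rolled generative test: many random transfers, one general property. */
final class AccountProperties {

    /** Returns the number of generated cases that were checked. */
    static int checkTransfers(int cases, long seed) {
        Random random = new Random(seed);
        int checked = 0;
        for (int i = 0; i < cases; i++) {
            Account payer = new Account(random.nextInt(100_000), random.nextInt(50_000));
            Account payee = new Account(random.nextInt(100_000), 0);
            long amount = random.nextInt(200_000) - 10_000; // deliberately includes invalid amounts
            long totalBefore = payer.balance() + payee.balance();
            boolean allowed = payer.canTransfer(amount);

            boolean rejected = false;
            try {
                payer.transferTo(payee, amount);
            } catch (IllegalStateException e) {
                rejected = true;
            }

            // Property 1: a transfer is rejected exactly when its pre-condition is false.
            if (rejected == allowed) {
                throw new AssertionError("pre-condition and outcome disagree for amount " + amount);
            }
            // Property 2: money is neither created nor destroyed.
            if (payer.balance() + payee.balance() != totalBefore) {
                throw new AssertionError("total balance changed for amount " + amount);
            }
            checked++;
        }
        return checked;
    }

    public static void main(String[] args) {
        System.out.println(checkTransfers(1_000, 42L) + " generated cases checked");
    }
}
```

Changing 1_000 to 10_000 in main is the single-character change described above. Notice, too, that the two properties in the test are close relatives of the pre-condition and post-condition in the contract.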

So there you have it: my code craft road map, in a nutshell. Many of these disciplines are covered in introductory – but practical – detail in the Codemanship TDD course book.

If your team could use a hands-on introduction to code craft, our 3-day hands-on TDD course can give them a head-start.

Iterating Is The Ultimate Requirements Discipline

The title of this blog post is something I’ve been trying to teach teams for many years now. As someone who very much drank the analysis and design Kool Aid of the 1990s, I learned through personal experience on dozens of projects – and from observing hundreds more from a safe distance – that time spent agonising over the system spec is largely time wasted.

A requirements specification is, at best, guesswork. It’s our starter for ten. When that spec – if the team builds what’s been requested, of course – meets the real world, all bets are usually off. This is why teams need more throws of the dice – as many as possible, really – to get it right. Most of the value in our code is added after that first production release, if we can incorporate our users’ feedback.

Probably the best way to illustrate this effect is with some code. Take a look at this simple algorithm for calculating square roots.

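public static double sqrt(double number) {
    // Zero is handled up front to avoid dividing by zero in the loop.
    if (number == 0) return 0;
    double t;

    // Initial guess for the iteration.
    double squareRoot = number / 2;

    // Refine the guess until it stops changing.
    do {
        t = squareRoot;
        squareRoot = (t + (number / t)) / 2;
    } while ((t - squareRoot) != 0);

    return squareRoot;
}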

When I mutation test this, I get a coverage report that says one line of code in this static method isn’t being tested.

[Image: PIT mutation testing coverage report]

The mutation testing tool turned number / 2 into number * 2, and all the tests still passed. But it turns out that number * 2 works just as well as the initial input for this iterative algorithm. Indeed, number * number works, and number * 10000000 works, too. It just takes an extra few loops to converge on the correct answer.
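
If you’d like to see that for yourself, here is a rough sketch that runs the same averaging step from different starting points and counts the loops. It is not the method above: the InitialGuessDemo and Outcome names are made up for this example, it assumes a positive number and a positive initial guess, and it stops on a small relative tolerance with an iteration cap rather than on exact equality, so that the demonstration is guaranteed to terminate.

```java
class InitialGuessDemo {
    // Simple holder for the answer and how many loops it took.
    static final class Outcome {
        final double root;
        final int iterations;

        Outcome(double root, int iterations) {
            this.root = root;
            this.iterations = iterations;
        }
    }

    // Same averaging step as the sqrt method, but with the initial guess
    // passed in. Assumes number > 0 and initialGuess > 0.
    static Outcome sqrtFrom(double number, double initialGuess) {
        double current = initialGuess;
        for (int iterations = 1; iterations <= 10_000; iterations++) {
            double next = (current + number / current) / 2;
            // Stop when the guess has effectively stopped moving.
            if (Math.abs(next - current) <= 1e-12 * next) {
                return new Outcome(next, iterations);
            }
            current = next;
        }
        throw new IllegalStateException("did not converge");
    }

    public static void main(String[] args) {
        double number = 25;
        double[] guesses = { number / 2, number * 2, number * number, number * 10_000_000 };
        for (double guess : guesses) {
            Outcome outcome = sqrtFrom(number, guess);
            System.out.printf("guess=%.1f root=%.10f iterations=%d%n",
                    guess, outcome.root, outcome.iterations);
        }
    }
}
```

Every starting point should land on the same root. The wildly wrong guesses just pay for it with a few more trips around the loop.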

It’s in the nature of convergent iterative processes that the initial input matters far less than the iterations. More frequent iterations will find a working solution sooner than any amount of up-front analysis and design.

This is why I encourage teams to focus on getting working software in front of end users sooner, and on iterating that solution faster. Even if your first release is way off the mark, you converge on something better soon enough. And if you don’t, you know the medicine’s not working sooner and waste a lot less time and money barking up the wrong mixed metaphor.

What I try to impress on teams and managers is that building it right is far from a ‘nice-to-have’. The technical discipline required to rapidly iterate working software and to sustain the pace of releases is absolutely essential to building the right thing, and it just happens to be the same technical discipline that produces reliable, maintainable software. That’s a win-win.

Iterating is the ultimate requirements discipline.

 

Code Craft’s Value Proposition: More Throws Of The Dice

Evolutionary design is a term that’s used often, not just in software development. Evolution is a way of solving complex problems, typically with necessarily complex solutions (solutions that have many interconnected/interacting parts).

But that complexity doesn’t arise in a single step. Evolved designs start very simple, and then become complex over many, many iterations. Importantly, each iteration of the design is tested for its “fitness” – does it work in the environment in which it operates? Iterations that don’t work are rejected, iterations that work best are selected, and become the input to the next iteration.

We can think of evolution as being a search algorithm. It searches the space of all possible solutions for the one that is the best fit to the problem(s) the design has to solve.

It’s explained best perhaps in Richard Dawkins’ book The Blind Watchmaker. Dawkins wrote a computer simulation of a natural process of evolution, where 9 “genes” generated what he called “biomorphs”. The program would generate a family of biomorphs – 9 at a time – with a parent biomorph at the centre surrounded by 8 children whose “DNA” differed from the parent by a single gene. Selecting one of the children made it the parent of a new generation of biomorphs, with 8 children of their own.

[Image: Biomorphs generated by the evolutionary simulation at http://www.emergentmind.com/biomorphs]

You can find a recreation and more detailed explanation of the simulation here.

The 9 genes of the biomorphs define a universe of 118 billion possible unique designs. The evolutionary process is a walk through that universe, moving just one space in any direction with each iteration, because just one gene changes with each generation. From simple beginnings, complex forms can quickly arise.

A brute force search might enumerate all possible solutions, test each one for fitness, and select the best out of that entire universe of designs. With Dawkins’ biomorphs, this would mean testing 118 billion designs to find the best. And the odds of selecting the best design at random are 1:118,000,000,000. There may, of course, be many viable designs in the universe of all possible solutions. But the chances of finding one of them with a single random selection – a guess – are still very small.

For a living organism, which has many orders of magnitude more elements in its genetic code and therefore an effectively infinite solution space to search, brute force simply isn’t viable. And the chances of landing on a viable genetic code in a single step are effectively zero. Evolution solves problems not by brute force or by astronomically improbable chance, but by small, perfectly probable steps.

If we think of the genes as a language, then it’s not a huge leap conceptually to think of a programming language in the same way. A programming language defines the universe of all possible programs that could be written in that language. Again, the chances of landing on a viable working solution to a complex problem in a single step are effectively zero. This is why Big Design Up-Front doesn’t work very well – arguably at all – as a solution search algorithm. There is almost always a need to iterate the design.

Natural evolution has three key components that make it work as a search algorithm:

  • Reproduction – the creation of a new generation that has a virtually identical genetic code
  • Mutation – tiny variances in the genetic code with each new generation that make it different in some way to the parent (e.g., taller, faster, better vision)
  • Selection – a mechanism for selecting the best solutions based on some “fitness” function against which each new generation can be tested

The mutations from one generation to the next are necessarily small. A fitness function describes a fitness landscape that can be projected onto our theoretical solution space of all possible programs written in a language. Programs that differ in small ways are more likely to have very similar fitness than programs that are very different. Make one change to a working solution and, chances are, you’ve still got a working solution. Make 100 changes, and the risk of breaking things is much higher.
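
Those three components fit in a surprisingly small amount of code. The sketch below is loosely modelled on the biomorphs setup (9 genes, 8 children per generation, one gene changed per child), but it is not Dawkins’ program. In his simulation a person picks the child they like best; here I’ve assumed a made-up numeric fitness function (how close the genes are to an arbitrary target) and I’ve assumed the parent survives when none of its children is fitter. The class and method names are hypothetical.

```java
import java.util.Random;

class EvolutionarySearch {
    static final int CHILDREN = 8;

    // Hypothetical fitness function: the negative distance between the genes
    // and some target, so 0 is a perfect fit and lower is worse.
    static int fitness(int[] genes, int[] target) {
        int distance = 0;
        for (int i = 0; i < genes.length; i++) {
            distance += Math.abs(genes[i] - target[i]);
        }
        return -distance;
    }

    // Reproduction and mutation: copy the parent, then nudge one gene by one.
    static int[] mutate(int[] parent, Random rng) {
        int[] child = parent.clone();
        int gene = rng.nextInt(child.length);
        child[gene] += rng.nextBoolean() ? 1 : -1;
        return child;
    }

    // Selection: the fittest of the parent and its children breeds next.
    static int[] evolve(int[] start, int[] target, int generations, long seed) {
        Random rng = new Random(seed);
        int[] parent = start.clone();
        for (int g = 0; g < generations && fitness(parent, target) < 0; g++) {
            int[] best = parent;
            for (int c = 0; c < CHILDREN; c++) {
                int[] child = mutate(parent, rng);
                if (fitness(child, target) > fitness(best, target)) {
                    best = child;
                }
            }
            parent = best;
        }
        return parent;
    }

    public static void main(String[] args) {
        int[] start = new int[9];                          // nine genes, all zero
        int[] target = {3, -2, 5, 0, 1, -4, 2, 2, -1};     // arbitrary target
        int[] result = evolve(start, target, 5000, 1L);
        System.out.println("final fitness: " + fitness(result, target));
    }
}
```

No single step in that loop is a leap. Each generation differs from its parent by one gene, and the fitness function does the steering.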

Evolutionary design works best when each iteration is almost identical to the last, with only one or two small changes. Teams practicing Continuous Delivery with a One-Feature-Per-Release policy, therefore, tend to arrive at better solutions than teams who schedule many changes in each release.

And within each release, there’s much more scope to test even smaller changes – micro-changes of the kind enacted in, say, refactoring, or in the micro-iterations of Test-Driven Development.

Which brings me neatly to the third component of evolutionary design: selection. In nature, the Big Bad World selects which genetic codes thrive and which are marked out for extinction. In software, we have other mechanisms.

Firstly, there’s our own version of the Big Bad World. This is the operating environment of the solution. A Point Of Sale system is ultimately selected or rejected through real use in real shops. An image manipulation program is selected or rejected by photographers and graphic designers (and computer programmers writing blog posts).

Real-world feedback from real-world use should never be underestimated as a form of testing. It’s the most valuable, most revealing, and most real form of testing.

Evolutionary design works better when we test our software in the real world more frequently. One production release a year is way too little feedback, way too late. One production release a week is far better.

Once we’ve established that the software is fit for purpose through customer testing – ideally in the real world – there are other kinds of testing we can do to help ensure the software stays working as we change it. A test suite can be thought of as a codified set of fitness functions for our solution.

One implication of the evolutionary design process is that, on average, more iterations will produce better solutions. And this means that faster iterations tend to arrive at a working solution sooner. Species with long life cycles – e.g., humans or elephants – evolve much more slowly than species with short life cycles, like fruit flies and bacteria. (Indeed, the latter evolve so fast that it’s been observed happening in the lab.) This is why health organisations have to guard against new viruses every year, but nobody’s worried about new kinds of shark suddenly emerging.

For this reason, anything in our development process that slows down the iterations impedes our search for a working solution. One key factor in this is how long it takes to build and re-test the software as we make changes to it. Teams whose build + test process takes seconds tend to arrive at better solutions sooner than teams whose builds take hours.

More generally, the faster and more frictionless the delivery pipeline of a development team, the faster they can iterate and the sooner a viable solution evolves. Some teams invest heavily in Continuous Delivery, and get changes from a programmer’s mind into production in minutes. Many teams under-invest, and changes can take weeks or months to reach the real world where the most useful feedback is to be had.

Other factors that create delivery friction include the maintainability of the code itself. Although a system may be complex, it can still be built from simple, single-purpose, modular parts that can be changed much faster and more cheaply than complex spaghetti code.

And while many BDUF teams focus on “getting it right first time”, the reality we observe is that the odds of getting it right first time are vanishingly small, no matter how hard we try. I’ll take more iterations over a more detailed requirements specification any day.

When people exclaim of code craft “What’s the point of building it right if we’re building the wrong thing?”, they fail to grasp the real purpose of the technical practices that underpin Continuous Delivery like unit testing, TDD, refactoring and Continuous Integration. We do these things precisely because we want to increase the chances of building the right thing. The real requirements analysis happens when we observe how users get on with our solutions in the real world, and feed back those lessons into a new iteration. The sooner we get our code out there, the sooner we can get that feedback. The faster we can iterate solutions, the sooner a viable solution can evolve. The longer we can sustain the iterations, the more throws of the dice we can give the customer.

That, ultimately, is the promise of good code craft: more throws of the dice.

 

Code Craft is More Throws Of The Dice

On the occasions founders ask me about the business case for code craft practices like unit testing, Continuous Integration and refactoring, we get to a crunch moment: will this guarantee success for my business?

Honestly? No. Nobody can guarantee that.

Unit testing can’t guarantee that. Test-Driven Development can’t guarantee that. Refactoring can’t guarantee it. Automated builds can’t guarantee it. Microservices can’t. The Cloud can’t. Event sourcing can’t. NoSQL can’t. Lean can’t. Scrum can’t. Kanban can’t. Agile can’t. Nothing can.

And that is the whole point of code craft. In the crap game of technology, every release of our product or system is almost certainly not a winning throw of the dice. You’re going to need to throw again. And again. And again. And again.

What code craft offers is more throws of the dice. It’s a very simple value proposition. Releasing working software sooner, more often and for longer improves your chances of hitting the jackpot. More so than any other discipline in software development.
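
To put some rough numbers on that, here’s a back-of-the-envelope sketch. It assumes, purely for illustration, that every release is an independent throw with the same chance of winning. Real releases build on feedback from the ones before, so this is a simplification. The class name, the method name and the 10% figure are all made up for the example.

```java
class ThrowsOfTheDice {
    // Chance of at least one winning throw in n independent throws,
    // each with probability p of winning: 1 - (1 - p)^n.
    static double chanceOfAtLeastOneWin(double p, int n) {
        return 1.0 - Math.pow(1.0 - p, n);
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 10, 50}) {
            System.out.printf("%d throws: %.1f%%%n", n, 100 * chanceOfAtLeastOneWin(0.1, n));
        }
    }
}
```

With a 10% chance per throw, a single throw wins 10% of the time, while ten throws win at least once about 65% of the time.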

No, But Seriously…

My blog’s been going for 14 years (including the blog at the old location), and it seems I’ve posted many times on the topic of developers engaging directly with their end users. I’ve strongly recommended it many, many times.

I’m not talking about sitting in meeting rooms asking “What software would you like us to build?” That’s the wrong question. If our goal is to build effective solutions, we need to build a good understanding of the problems we’re setting out to solve.

My whole approach to software development is driven by problems – understanding them, designing solutions for them, testing those solutions in the real (or as real as possible) world, and feeding back lessons learned into the next design iteration.

That, to me, is software development.

And I’ve long held that to really understand our end users, we must become them. Even if it’s just for a short time. We need to walk a mile in their shoes, eat our own dog food, and any other euphemism for experiencing what it’s like to do their job using our software.

Traditional business and requirements analysis techniques – with which I’m very familiar – are completely inadequate to the task. No number of meetings, boxes and arrows, glossaries and other analysis paraphernalia will come close to seeing it and experiencing it for yourself.

And every time I say this, developers nod their heads and agree that this is sage advice indeed. And then they don’t do it. Ever.

In fact, many developers – at the suggestion of spending time actually embedded in the business, seeing how the business works and the problems the business faces – run a mile in the opposite direction. Which is a real shame, because this really is – hands down – the most useful thing we could do. Trying to solve problems we don’t understand is a road to nowhere.

So, I’ll say it again – and keep saying it.

Developers – that’s you, probably – need to spend real, quality time embedded with their end users, seeing how they work, seeing how they use our software, and experiencing all of that for ourselves. It should be a regular thing. It should be the norm. Don’t let a business analyst take your place. Experiencing it second- or third-hand is no substitute for the real thing. If the goal is to get the problem into your head, then your head really needs to be there.

If your software is used internally within the business, embed in those teams. If your software’s used by external customers, become one of them. And spend time in the sales team. Spend time in the support team. Spend time in the marketing team. Find out for yourself what it’s really like to live with the software you create. I’m always amazed at how many dev teams literally have no idea.

Likely as not, it will radically transform the way you think about your product and your development processes.