Automate, Automate, Autonomy!

Thanks to pandemic-induced economic chaos, you’ve been forced to take a job on the quality assurance line at a factory that produces things.

The machine creates all kinds of random things, but your employer only sells a very specific subset of those things. All the things that don’t fit the profile have to be rejected, melted down, and fed back into the machine to make more things.

On your first day, you get training. (Oh, would that were true in software development!)

They stand you at the quality gate and start up the machine. All kinds of things come down the line at you. Your line manager tells you “Only let the green things through”. You grab all the things that aren’t green and throw them into the recycle bin. So far, so good.

“Only let the green round things through!” shouts your line manager. Okay, you think. Bit harder now. All non-green, non-round things go in the bin.

“Only let the green round small things through!” Now you’re really having to concentrate. A few green round small things end up in the bin, and a few non-green, non-round, non-small things get through.

“Only let the green round small things with Japanese writing on them through!” That’s a lot to process at the same time. Now your brain is struggling to cope. A bunch of blue and red things with Japanese writing on them get through. A bunch of square things get through. Your score has gone from 100% accurate to just 90%. Either someone will have to go through the boxes that have been packed and pick out all the rejects, or they’ll have to deal with 10% customer returns after they’ve been shipped.

“Only let the green round small things with Japanese writing on them through that have beveled edges and a USB charging port on the opposite side to the writing and a power button in the middle of the writing and a picture of a horse – not a donkey, mind, reject those ones! – and that glow in the dark!”

Now it’s chaos. Almost every box shipped contains things that should have been thrown in the recycle bin. Almost every order gets returned. That’s just too much to process. Too many criteria.

We have several choices here:

  1. Slow down the line so we can methodically examine every thing against our checklist, one criterion at a time.
  2. Hire a whole bunch of people and give them one check each to do.
  3. Reset customer expectations about the quality of the things they’re buying.
  4. Automate the checking using cameras and robots and lasers and super-advanced A.I. so all those checks can be made at production speed to a high enough accuracy.

Number 4 is the option that potentially gives us the win-win of customer satisfaction and high productivity without the bigger payroll. It’s been the driving force behind the manufacturing revolutions in East Asia for the last 70 years: automate, automate, automate.

But it doesn’t come for free. High levels of automation require considerable ongoing investment in time, technology and training. In the UK, we’ve under-invested, becoming more and more inefficient and expensive while the quality of our output has declined. Shareholders want their return now. There’s no appetite for making improvements for the future.

There are obvious parallels in software development. Businesses want their software now. Most software organisations have little inclination to invest the time, technology and training required to reach the high levels of automation needed to achieve the coveted continuous delivery that would allow them to satisfy customer needs sooner, cheaper, and for longer.

The inescapable reality is that frictionless delivery demands an investment of 20-25% of your total software development budget. To put it more bluntly, everyone should be spending 1 day a week not on immediate customer requirements, but on making improvements in the delivery process that would mean meeting future customer requirements will be easier.

And so, for most teams, it never gets easier. The software just gets buggier, later and more expensive year after year.

What distinguishes those software teams who are getting it right from the rest? From observation, I’ve seen the same factor every time: autonomy. Teams will invest that 20-25% when it’s their choice. They’re tasked with delivering value, and allowed to figure out how best to do that. Nobody’s telling them how to do their jobs.

How did this blissful state come about? Again, from observation, those teams have autonomy because they took it. Freedom is rarely willingly given.

Now, I appreciate this is a whole can of worms. To take their autonomy, teams need to earn trust. The more trust a team has earned, the more likely they’ll be left alone. And this can be a chicken and egg kind of situation. To earn trust, the team has to reliably deliver. To reliably deliver, the team needs autonomy. This whole process must begin with a leap of faith on the business’s part. In other words, they have to give teams the benefit of the doubt long enough to see the results.

And here come the worms… Teams have to win over their customer from the start, before anything’s been delivered – before the customer’s had a chance to taste our pudding. This means that developers need to inspire enough confidence with their non-technical stakeholders – remember, this is a big money game – to reassure everyone that they’re in good hands. And we’re really, really bad at this.

The temptation is to over-promise and set unrealistic expectations. This pretty much guarantees disappointment. The best way to inspire confidence is to have a good track record. No lawyer can guarantee to win your case. But a lawyer who won 9 of their last 10 cases is going to inspire more confidence than a lawyer promising you a win on their very first case.

And we’re really, really bad at this, too – chiefly because software development teams are newly formed for that specific piece of work and don’t have a track record to speak of. Sure, individual developers may be known quantities, but in software, the unit of delivery is the team. I’ve watched teams of individually very strong developers fall flat on their collective arse.

And this is why I believe that this picture won’t change until organisations start to view teams as assets, and invest in them for a long-term pay-off as well as short-term delivery – that 20/80 split. And, again, I don’t think this will be willingly given. So maybe we – as a profession – need to take the decision out of their hands.

It could all start with one big act of collective autonomy.


For Distributed Teams, Code Craft is Critical

Right now, most software teams all around the world are working from home. Many have not done it before, and are on a learning curve that means last week’s productivity won’t be returning for a while.

I’ve worked on distributed teams many times, and – through Codemanship – trained and mentored dozens of teams remotely. One thing I’ve learned from all that remote development experience is that coding discipline becomes super-important.

Just as distributed systems amplify every design flaw, turning what would be a headache in a monolith into a major outbreak in a service-oriented architecture, distributed working amplifies team dysfunctions as the communication pathways take on extra weight.

Here’s how code craft can help:

  • Unit tests – keeping the software working is Distributed Dev Team 101. Open Source projects rely on suites of fast-running tests to protect against check-ins that break the code.
  • Continuous Integration – is how distributed teams communicate their changes to each other. Distributed teams need to merge their changes to the master branch often and stay build-aware, keeping one eye on other people’s merges to see what’s changed. It’s much easier on a co-located team to keep everyone in step, because we can see and talk to each other about the changes we’re making. If remote developers do infrequent, large merges, integration hell gets amplified tenfold by the extra communication barriers.
  • Test-Driven Development – a lot of the communication between developers, and between developers and their customers, can be handwavy and vague. And if communication is easy – like on a co-located team – we just go around a few more times until we converge on what’s required. But when communication is harder, like in distributed teams, a few more goes around the loop get very expensive. Using executable tests as specifications removes the ambiguity. It should do exactly this. Also, TDD – done well – produces suites of useful, fast-running automated tests. It’s a win-win.
  • Design Principles – Well-factored code is very important to co-located teams, and super-duper-important to distributed teams. Let’s count the ways:
    • Simple Design
      • Code should work – if it doesn’t work, we can’t ship it. Any changes that break the code block the team. It’s a big deal on a co-located team, but it’s a really big deal on a distributed team.
      • Code should clearly communicate its intent – code should speak for itself, and when developers are working remotely and communicating requires extra effort, this is especially true. The easier code is to understand, the fewer teleconferences required to understand it.
      • Code should be free of duplication – so much duplication in software is duplication of concepts. This often occurs when developers on teams work in isolation, unaware that someone else has already added a module that does what their module also does. Devs need to be aware of duplication in the code – Continuous Integration and merge awareness help – and clued up about when they should refactor it and when they should leave it alone.
      • Code should be as simple as we can make it – every line of code that has to be maintained is another straw on the camel’s back. When the camel’s back stretches between multiple locations – possibly in multiple time zones – the impact of every additional straw is felt many-fold.
    • Modular Design
      • Modules should do one job – the ability to change the behaviour of a system by just editing one module is critical to a team’s ability to make the changes they need without treading on the toes of other developers. On distributed teams, multiple developers all making changes to one module for multiple reasons can lead to some spectacular merge train wrecks.
      • Modules should hide their internal workings – the more modules are coupled to each other, the wider the impact of even the smallest changes will be felt. Imagine your distributed team is working precariously balanced on high wires that are all interconnected. What you don’t want is for one person to start violently shaking their wire, sending ripples throughout the network. Or it could all come tumbling down. Again, it’s bad on co-located teams, but it’s Double-Plus-Triple-Word-Score-Bad on distributed teams. Every dependency can bring pain.
      • Modules should not depend directly on implementations of other modules – it’s good architecture generally for modules not to bind directly to implementations of the other modules they use, for a variety of reasons. But it’s especially important when teams aren’t co-located. Taken together, the first three principles of modular design are better known as “Separation of Concerns”. Or, as I like to call it, the Principle of Somebody Else’s Problem. If my module needs to send an email, I shouldn’t need to know how emails are actually sent – all that detail should be hidden from me – and I should be able to work on my code without having to actually send emails when I test it. Sending emails is somebody else’s problem. It’s particularly useful in a test-driven approach to design to be able to write a test for code that has external dependencies – things it uses that other developers are working on – without binding directly to the implementation of that external component, so that we can swap in a test double that pretends to do that job. That’s how you scale TDD. That’s how you make TDD work in distributed teams, too. There’s a minimal sketch of this idea in code after this list.
      • Module interfaces should be designed from the client’s point of view – tied together with TDD, we can specify modules very precisely from the outside: this is what it should look like (interface) and this is what it should do (tests). Imagine your distributed team is making a jigsaw: the hard way to do it is to have each person go off and make a piece of the jigsaw and then hope that they all fit together at the end. The smart way to do it is to define the shapes of the pieces as parts of the whole puzzle, and then have people implement the pieces based on the interfaces and tests agreed. You do this by designing systems from the outside in, defining modules by how they will be used from the client code’s POV. This also helps to restrict public interfaces to only what clients need to see, hiding internal details, improving encapsulation and reducing coupling. Coupling on distributed teams can be very, very expensive.
    • Refactoring – the still-rather-too-rare discipline of reshaping code without breaking the software is the means by which we achieve good design. Try as we might to never write code that’s hard to understand, or has duplication, or is overly complex, or too tightly coupled, we’ll always need to clean up our code as we go. If the impact of poor design is amplified on distributed teams, the importance of refactoring must be proportionally amplified. The alternative is relying on after-the-fact code reviews (e.g., in GitFlow), which will become multiple times the bottleneck they already were when your team was co-located and you could just pop over to Mary’s desk and ask.
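
To make the “Somebody Else’s Problem” idea concrete, here’s a minimal sketch in Python. All the names (Notifier, FakeNotifier, register_user) are hypothetical, invented purely for illustration; the point is that the test acts as an executable specification and swaps the real email-sending implementation for a test double, so it runs fast, with no external dependencies, and without treading on whoever owns the real email gateway.

    from dataclasses import dataclass, field
    from typing import Protocol


    class Notifier(Protocol):
        """What my module needs. How emails actually get sent is somebody else's problem."""
        def send(self, to: str, message: str) -> None: ...


    @dataclass
    class FakeNotifier:
        """Test double standing in for the real email gateway."""
        sent: list = field(default_factory=list)

        def send(self, to: str, message: str) -> None:
            self.sent.append((to, message))


    def register_user(email: str, notifier: Notifier) -> None:
        """Behaviour under test: registering a user triggers a welcome message."""
        notifier.send(email, "Welcome aboard!")


    def test_registration_sends_welcome_message():
        notifier = FakeNotifier()
        register_user("dev@example.com", notifier)
        assert notifier.sent == [("dev@example.com", "Welcome aboard!")]

Because register_user binds to an interface rather than an implementation, the real email-sending module can be developed – and changed – independently, and the test stays fast and unambiguous.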

Underpinning all of this is a need for levels of delivery process automation – automated testing, automated builds, automated deployments, automated code reviews – that the majority of teams are nowhere near.

And then there’s the interpersonal: the communication, the coordination, the planning and tracking, the collaborative design. It takes a big investment to make a distributed Agile team as productive as a co-located team.

All the Jiras and GitHubs and cloud-based build pipelines and remote whiteboards and shared IDEs and Zoom meetings in the world won’t save you if the code craft isn’t up to snuff, though. It’s foundational to delivering as a distributed team.

If you want to know more about code craft, visit www.codemanship.com


The Experience Paradox

One sentiment I hear very often from managers is how very difficult it is to hire experienced developers. This, of course, is a self-fulfilling prophecy. If you won’t hire developers without experience, how will inexperienced developers get the jobs they need to gain that experience?

I simultaneously hear from new developers – recent graduates, boot camp survivors, and so on – that they really struggle to land that first development job because they don’t have the experience.

When you hear both of these at the same time, it becomes a conversation. Unfortunately, it’s a conversation conducted from inside soundproof booths, and I’m seeing no signs that this very obvious circle will be squared any time soon.

I guess we have to add it to the list of things that everyone knows are wrong, but everyone does anyway. (Like adding more developers to teams when the schedule’s slipping.)

Organisations should hire for potential as well as experience. People who have the potential to be good developers can learn from people with experience. It’s a match made in heaven, and the end result is a larger pool of experienced developers. This problem could fix itself, if we were of a mind to let it. We all know this. So why don’t we do it?

At the other end of the spectrum, I hear many, many managers say “This person is too experienced to be a developer”. And at the same time, I hear many, many very experienced developers struggle to find work. This, too, is a problem that creates itself. Typically, there are two reasons why managers rule out the most experienced developers:

  • They expect to get paid more (because they usually achieve more)
  • They won’t put up with your bulls**t

Less experienced developers may be more malleable in terms of how much – or little – you can pay them, how much unpaid overtime they may be willing to tolerate, and how willing they might be to cut corners when ordered to. They may have yet to learn what their work is really worth to you. They may have yet to experience burnout. They may have yet to live with a large amount of technical debt.

Developers with 20+ years’ experience, who’ve been around the block a few times and know the score, don’t fit into the picture of developers as fungible resources.

By freezing out inexperienced developers and very experienced developers, employers create the exact situation they complain endlessly about – lack of experienced developers. If it were my company and my money on the line, I’d hire developers with decades of experience specifically to mentor the inexperienced developers with potential I’d also be hiring.

Many employers, of course, argue that training up developers is too much of a financial risk. Once they’re good enough, won’t they leave for a better job? The clue is in the question, though. They leave for a better job. Better than the one you’re offering them once they qualify. Don’t just be a stepping stone, be a tropical island – a place they would want to stay and work.

If you think training up developers is going to generate teams of cheaper developers who’ll work harder and longer for less, then – yes – they’ll leave at the first opportunity. Finding a better job won’t be hard.

Are Only 1 in 4 Developers Trusted?

Following on from my previous post about Trust & Agility, I ran a straw poll on the @codemanship Twitter account as a sort of litmus test of how much trust organisations put in their developers.

A new monitor is a decision of the same order of magnitude of risk as the proverbial “$100 decision” that I’ve used for many years now when assessing the capacity for trust – and therefore for decentralised decision-making – that agility requires.

If I wanted a new monitor for my home office, that would be entirely my decision. There was, of course, a time when it would not have been. That was when I was a child. If 13-year-old me wanted a new monitor, I would have had to refer that decision up to my parents.

When I got jobs, I made my own money and bought my own monitors. That’s called “being a grown-up”.

I’ve worked in organisations where even junior developers were handed corporate credit cards – with monthly limits – so they could “just go get it”, and not bother management with insignificant details. Smaller decisions get made when they need to be made, and they don’t get caught in a bottleneck. It really works.

So why are 3 out of 4 developers still not trusted to be grown-ups?

UPDATE:

A few people have got in touch giving me the impression that they took this to be literally about buying monitors or about company credit cards. These are just simple examples; they often indicate a wider problem. Developers who I’ve seen wait weeks for a monitor, or a laptop, or a software license, have tended to be the ones who “wanted to do TDD but weren’t allowed”, or who “wanted 20% but the boss said ‘no’”, or who wanted to go to a conference but were refused. These tend to be the teams that can’t self-organise to get things done. In my 20 years’ experience of Agile Software Development, the correlation has been very close between teams of developers who have to ask permission for a new monitor – just as an example (IT’S NOT ABOUT BUYING MONITORS!) – and teams who have to ask for permission to automate deployments or to talk to the end users directly.

Trust & Agility

I find it useful to revisit the Manifesto for Agile Software Development when I think I need a reminder of what it was originally all about. One of the 12 principles of Agile software is:

Build projects around motivated individuals.
Give them the environment and support they need,
and trust them to get the job done

Now, it’s just one little word, buried in that text, but in my experience it makes all the difference: trust.

In two decades of kind of sort of doing Agile Software Development, I’ve found one simple metric that has tended to be the best predictor of whether an organisation is capable of agility: how much does it cost here to make a $100 decision?

Let me give you an example. I worked on a contract once where the team needed a couple of whiteboards in the room. I’ve bought those kinds of whiteboards. I know what they cost (a few hundred pounds), and I know they can be acquired within 24 hours very easily.

If I want a whiteboard for my home office, I go online to one of the thousands of retailers who sell them, whip out my credit card, and that whiteboard will be in my office and covered with boxes and arrows the next morning.

In this particular client organisation it took multiple meetings and hours of everyone’s time. It got escalated to the IT director, who – presumably – had nothing more important to think about. We spent thousands of pounds deciding to buy a few hundred pounds-worth of whiteboards, and for several weeks we struggled with design discussions at a critically early stage of the work for the want of a few square metres of something cheap we could draw on. And in the end, the IT director didn’t buy us two basic boards. He bought us a £4,000 electronic whiteboard that we never used. Eventually, real whiteboards did appear, but not before a heap more discussions.

Now, it shouldn’t have come as a surprise, because the organisation in question was in the business of selling shrink-wrapped bureaucracy. But the whiteboard debacle turned out to be an omen of what was to come. I’ve seen that many, many times in my career.

These days, I test for that metric when I get the chance. As a small, niche supplier of training and coaching services, it’s in my interest to try and avoid overly bureaucratic clients. The cost of doing business with them often outweighs the benefits. And, let’s face it, their developers are unlikely to be given the time and support they need to learn code craft. Agility is not for them.

What has this got to do with trust? And what does trust have to do with agility?

It’s simple: the reason we couldn’t just go out and buy whiteboards is because the managers didn’t trust us to make that decision. So we had to wait. And wait. And wait.

Agility demands that the people doing the work, whenever possible, make the decisions about how the work is done. Management have to support them in that, and remove any obstacle getting in the team’s way. Lack of whiteboard space in the early iterations of a design process was an obstacle, created by lack of trust, that cost us time and money and stopped us getting stuff done.

Want stuff to get done? Then you need to trust the people doing it to just get on with it. Escalating what are relatively trivial decisions costs time. A whiteboard. A monitor. More memory. A WebStorm license. These are all small, low-risk decisions that hold teams back when they’re delayed.

Making small, low-risk decisions quickly is foundational to agility. Fail fast. If an organisation can’t make a $100 decision quickly and easily, then the more it costs them to decide and to act, the further agility will be out of reach.

Sure, if the team wants to move to Sweden, or buy a minibus, that size of decision will need to be discussed at a higher level of budgetary authority. But a whiteboard? Or a laptop? Or – goddammit – stationery?! The sad fact is that far too many teams have absolutely no budgetary discretion at all (unless they want to have a meeting, in which case – bizarrely – the sky’s the limit).

Of course, some will say “Trust has to be earned”, but for teams this is a Catch 22. They have to deliver to earn the trust to make their own decisions, but they’re less likely to deliver if they can’t make those decisions. It’s a self-fulfilling prophecy. Teams who aren’t trusted will struggle to earn trust.

My solution has always been to trust myself and make the decision anyway. This has not gone down well with some managers, who demand compliance from their teams. This is why I and the corporate world have tended to be incompatible, and why I now run my own business and buy my own whiteboards.

Lack of trust is usually a symptom of fear of failure. It’s a learned behaviour built on historical failures – usually by completely different teams of completely different people. When we refuse to give Team A autonomy because Teams B, C, and D failed in the past, then we’re punishing the children for the sins of their ancestors.

It takes courage – one of the core values of Extreme Programming – to press the reset button and say “Okay, I’m going to let Team A do their thing and I won’t interfere unless they screw up”. It takes even more courage – and honesty – to recognise that a team that’s not delivering might actually be being held back by our lack of trust, and to relinquish some control.

This is why so many organisations fail to achieve agility. Managers are afraid to take their hands off the wheel and let the team drive. So many “Agile transformations” – especially at scale – push in the exact opposite direction, attempting to impose processes and standards that constrain teams while giving managers nothing more than a greater illusion of control. This is the antithesis of agility.

I’ve only seen it work when managers genuinely empower their teams. As terrifying as it may sound to hand a company credit card to people who cover their laptops in Marvin The Martian stickers and come to work on tiny scooters, the alternative is that you accept the inevitable bureaucracy that will result if you won’t trust your teams.


Code, Meet World #NoBacklogs

Do we dare to imagine Agile without backlogs? How would that work? How would we know what to build in this iteration? How would we know what’s coming in future iterations?

Since getting into Agile Software Development (and its precursors, like DSDM and Extreme Programming), I’ve gradually become convinced that most of us have been doing it fundamentally wrong.

It’s a question of emphasis. What I see is thousands of teams working their way through a product plan. They deliver increments of the plan every week or three, and the illusion of being driven by feedback instead of by the plan is created by showing each increment to a “customer” and asking “Waddaya think?”

In the mid-90s, I worked as the sole developer on a project where my project managers – two of them! – would make me update their detailed plan every week. It was all about delivering the plan, and every time the plan bumped into reality, the whole 6-month plan had to be updated all over again.

This was most definitely plan-driven software development. We planned, planned, and planned again. And at no point did anybody suggest that maybe spending 1 day a week updating a detailed long-term plan might be wasted effort.

Inspired by that experience, I investigated alternative approaches to planning that could work in the real world. And I discovered one in a book by Tom Gilb called Principles of Software Engineering Management. Tom described an approach to planning that chimed with my own experience of how real projects worked in the real world.

It’s a question of complexity. Weather is very, very complicated. And this makes weather notoriously difficult to predict in detail. The further ahead we look, the less accurate our predictions tend to be. What will the weather be like tomorrow? We can take a pretty good guess with the latest forecasting models. What will the weather be like the day after tomorrow? We can take a guess, but it’s less likely to be accurate. What will the weather be like 6 weeks next Tuesday? Any detailed prediction is very likely to be wrong. That’s an inherent property of complex systems: they’re unpredictable in the long term.

Software development is also complex. Complex in the code, its thousands of component parts, and the interactions between them. Complex in the teams, which are biological systems. Complex in how the software will interact with the real world. There are almost always things we didn’t think of.

So the idea that we can predict what features a software system will need in detail, looking months – or even just weeks – ahead, seemed a nonsense to me.

But, although complex systems can be inherently unpredictable in detail, they tend to be – unless they’re completely chaotic – roughly predictable in general terms.

We can’t tell you what the exact temperature will be outside the Dog & Fox in Wimbledon Village at 11:33am on August 13th 2025, but we can pretty confidently predict that it will not be snowing.

And we can’t be confident that we’ll definitely need a button marked “Sort A-Z” on a web page titled “Contacts” that displays an HTML table of names and addresses, to be delivered 3 months from now in the 12th increment of a React web application. But we can be confident that users will need to find an address to send their Christmas card to.

The further we look into the future, the less detailed our predictions need to become if they are to be useful in providing long-term direction. And they need to be less detailed to avoid the burden of continually updating a detailed plan that we know is going to change anyway.

This was a game-changer for me. I realised that plans are perishable goods. Plans rot. Curating a detailed 6-month plan, to me, was like buying a 6-month supply of tomatoes. You’ll be back at the supermarket within a fortnight.

I also realised that you’ve gotta eat those tomatoes before they go bad. It’s the only way to know if they’re good tomatoes. Features delivered months after they were conceived are likely rotten – full of untested assumptions piled on top of untested assumptions about what the end users really need. In software, details are very highly biodegradable.

So we need to test our features before they go off. And the best place to test them is in the real world, with real end users, doing real work. Until our code meets the real world, it’s all just guesswork.

Of course, in some domains, releasing software into production every week – or every day even – is neither practical nor desirable. I wouldn’t necessarily recommend it for a nuclear power station, for example.

And in these situations where releases create high risk, or high disruption to end users, we can craft simulated release environments where real end users can try the software in an as-real-as-we-can-make-it world.

If detailed plans are only likely to survive until the next release, and if the next release should be at most a couple of weeks away, then arguably we should only plan in detail – i.e., at the level of features – up to the next release.

Beyond that, we should consider general goals instead of features. In each iteration, we ask “What would be the simplest set of features that might achieve this goal?” If the feature set is too large to fit in that iteration, we can break the goal down. We build that feature set, and release for end user testing in the real (or simu-real) world to see if it works.

Chances are, it won’t work. It might be close, but usually there’s no cigar. So we learn from that iteration and feed the lessons back in to the next. Maybe extra features are required. Maybe features need tweaking. Maybe we’re barking up the wrong tree and need to completely rethink.

Each iteration is therefore about achieving a goal (or a sub-goal), not about delivering a set of features. And the output of each release is not features, but what we learn from watching real end users try to achieve their goal using the features. The output of software releases is learning, not features.

This also re-frames the definition of “done”. We’re not done because we delivered the features. We’re only done when we’ve achieved the goal. Maybe we do that in one iteration. Maybe we need more throws of the dice to get there.

So this model of software development sees cross-functional teams working as one to achieve a goal, potentially making multiple passes at it, and working one goal at a time. The goal defines the team. “We’re the team who enables diners to find great Chinese food”. “We’re the team who puts guitar players in touch with drummers in their town.” “We’re the team who makes sure patients don’t forget when their repeat prescriptions need to be ordered.”

Maybe you need a programmer to make that happen. Maybe you need a web designer to make that happen. Maybe you need a database expert to make that happen. The team is whoever you need to achieve that goal in the real world.

Now I look at the current state of Agile, and I see so many teams munching their way through a list of features, and so few teams working together to solve real end users’ problems. Most teams don’t even meet real end users, and never see how the software gets used in the real world. Most teams don’t know what problem they’re setting out to solve. Most teams are releasing rotten tomatoes, and learning little from each release.

And driving all of this, most teams have a dedicated person who manages that backlog of features, tweaking it and “grooming” it every time the plan needs to change. This is no different to my experiences of updating detailed project plans in the 90s. It’s plan-driven development, plain and simple.

Want to get unstuck from working through a detailed long-term plan? Then ditch the backlog, get with your real customer, and start solving real problems together in the real world.

Do We Confuse Work With Value?

Here’s a little thought experiment. Acme Widgets need a new stock control system. They invite software development companies to bid. Big Grey IT Corporation put forward a bid for the system that will take 20 weeks with a team of 50 developers, each developer getting paid £2,000 a week. Their total bid for the system is £2 million.

Acme’s CEO is about to sign off on the work when he gets a call from Whizzo Agile Inc, who have read the tender specification and claim they can deliver the exact same system in the exact same time with a team of just 4 crack developers.

Assuming both teams would deliver the exact same outcome at the exact same time (and, yes, in reality a team of 50 would very probably deliver late or not at all, statistically), how much should Whizzo Agile charge?

It’s a purely hypothetical question, but I’ve seen similar scenarios play out in real life. A government department invites bids for an IT system. The big players put in ludicrously huge bids (e.g., £400 million for a website), and justify them by massive over-staffing of the project. Smaller players – the ones who can actually afford to tender and who have the government contacts – tend to get shut out, no matter how good their bids look.

And I’ve long wondered if the problem here is how we tend to confuse work with value. Does that calculation at the back of the customer’s mind go something like: “Okay, so Option A is £2M for 5,000 developer days, and Option B is £0.5M for 400 developer days”? With Option A, the customer calculates they’ll get more work for their money.

But the value delivered is identical (in theory – in reality a smaller team would probably do a better job). They get the same software in the same time frame, and they get it at a quarter of the price. It’s just that the developers on the small team get paid a lot more to deliver it. I’ve watched so many managers turn down Option B over the last 30 years.

A company considering a bid of £2M to build a software system is announcing that the value of that system to their business is significantly more than £2M. Let’s say, for the sake of argument, it’s double: £4M. Option A brings them a return of £2M. Option B brings a return of £3.5M. In those terms, is it a good business decision to go with Option A?

In other areas of business, choosing to go with Option A would get you sacked. Why, in software development, is it so often the other way around?


The Value’s Not In Features, It’s In Learning

A small poll I ran on the Codemanship Twitter account seems to confirm what I’ve observed and heard in the field about “agile” development teams still being largely plan-driven.

If you’re genuinely feedback-driven, then your product backlog won’t survive much further out than the next cycle of customer feedback. Maintaining backlogs that look months ahead is a sign that just maybe you’re incrementally working through a feature list instead of iteratively solving a set of business problems.

And this cuts to the core of a major, fundamental malaise in contemporary Agile Software Development. Teams are failing to grasp that the “value” that “flows” in software development is in what we learn with each iteration, not in the features themselves.

Perhaps a better name for “features” might be “guesses” – we’re guessing what might be needed to solve a problem. We won’t know until we’ve tried, though. So each release is a vital opportunity to test our assumptions and feed back what we learn into the next release.

I see teams vigorously defending their product backlogs from significant change, and energetically avoiding feedback that might reveal that we got it wrong this time. Folk have invested a lot in the creation of their backlog – often envisioning a whole product in significant detail – and can take it pretty personally when the end users say “Nope, this isn’t working”.

With a first release – when our code meets the real world for the first time – I expect a lot of change to a product vision. With learning and subsequent iterations of the design, the product vision will usually stabilise. But when we track how much the backlog changes with each release on most teams, we see mostly tweaking. Initial product visions – which, let’s be clear, are just educated guesses at best – tend to remain largely intact. Once folk are invested in a solution, they struggle to let go of it.

Teams with a strong product vision often suffer from confirmation bias when considering end user feedback. (Remember: the “customer” on many products is just as invested in the product vision if they’ve actively participated in its creation.) Feedback that supports their thesis tends to be promoted. Feedback that contradicts their guesswork gets demoted. It’s just human nature, but its skewing effect on the design process usually gets overlooked.

The best way to avoid becoming wedded to a detailed product vision or plan is not to have a detailed product vision or plan. Assume as little as possible to make something simple we can learn from in the next go-round. Focus on achieving long-term goals, not on delivering detailed plans.

In simpler terms: ditch the backlogs.

Codemanship’s Code Craft Road Map

One of the goals behind my training courses is to help developers navigate all the various disciplines of what we these days call code craft.

It helps me to have a mental road map of these disciplines, refined from three decades of developing software professionally.

[Image: Codemanship’s code craft road map]

When I posted this on Twitter, a couple of people got in touch to say that they find it helpful, but also that a few of the disciplines were unfamiliar to them. So I thought it might be useful to go through them and summarise what they mean.

  • Foundations – the core enabling practices of code craft
    • Unit Testing – is writing fast-running automated tests that check the logic of our code, and that we can run many times a day to ensure any changes we’ve made haven’t broken the software. We currently know of no other practical way of achieving this. Slow tests cause major bottlenecks in the development process, and tend to produce less reliable code that’s more expensive to maintain. Some folk say “unit testing” to mean “tests that check a single function, or a single module”. I mean “tests that have no external dependencies (e.g., a database) and run very fast”.
    • Version Control – is seat belts for programmers. The ability to go back to a previous working version of the code provides essential safety and frees us to be bolder with our code experiments. Version Control Systems these days also enable more effective collaboration between developers working on the same code base. I still occasionally see teams editing live code together, or even emailing source files to each other. That, my friends, is the hard way.
    • Evolutionary Development – is what fast-running unit tests and version control enable. It is one or more programmers and their customers collectively solving problems together through a series of rapid releases of a working solution, getting it less wrong with each pass based on real-world feedback. It is not teams incrementally munching their way through a feature list or any other kind of detailed plan. It’s all about the feedback, which is where we learn what works and what doesn’t. There are many takes on evolutionary development. Mine starts with a testable business goal, and ends with that goal being achieved. Yours should, too. Every release is an experiment, and experiments can fail. So the ability to revert to a previous version of the code is essential. Fast-running unit tests help keep changes to code safe and affordable. If we can’t change the code easily, evolution stalls. All of the practices of code craft are designed to enable rapid and sustained evolution of working software. In short, code craft means more throws of the dice.
  • Team Craft – how developers work together to deliver software
    • Pair Programming – is two programmers working side-by-side (figuratively speaking, because sometimes they might not even be on the same continent), writing code in real time as a single unit. One types the code – the “driver” – and one provides high-level directions – the “navigator”. When we’re driving, it’s easy to miss the bigger picture. Just like on a car journey, in the days before GPS navigation. The person at the wheel needs to be concentrating on the road, so a passenger reads the map and tells them where to go. The navigator also keeps an eye out for hazards the driver may have missed. In programming terms, that could be code quality problems, missing tests, and so on – things that could make the code harder to change later. In that sense, the navigator in a programming pair acts as a kind of quality gate, catching problems the driver may not have noticed. Studies show that pair programming produces better quality code, when it’s done effectively. It’s also a great way to share knowledge within a team. One pairing partner may know, for example, useful shortcuts in their editor that the other doesn’t. If members of a team pair with each other regularly, soon enough they’ll all know those shortcuts. Teams that pair tend to learn faster. That’s why pairing is an essential component of Codemanship training and coaching. But I appreciate that many teams view pairing as “two programmers doing the work of one”, and pair programming can be a tough sell to management. I see it a different way: for me, pair programming is two programmers avoiding the rework of seven.
    • Mob Programming – sometimes, especially in the early stages of development, we need to get the whole team on the same page. I’ve been using mob programming – where the team, or a section of it, all work together in real-time on the same code (typically around a big TV or projector screen) – for nearly 20 years. I’m a fan of how it can bring forward all those discussions and disagreements about design, about the team’s approach, and about the problem domain, airing all those issues early in the process. More recently, I’ve been encouraging teams to mob instead of having team meetings. There’s only so much we can iron out sitting around a table talking. Eventually, I like to see the code. It’s striking how often debates and misunderstandings evaporate when we actually look at the real code and try our ideas for real as a group. For me, the essence of mob programming is: don’t tell me, show me. And with more brains in the room, we greatly increase the odds that someone knows the answer. It’s telling that when we do team exercises on Codemanship workshops, the teams that mob tend to complete the exercises faster than the teams who work in parallel. And, like pair programming, mobbing accelerates team learning. If you have junior or trainee developers on your team, I seriously recommend regular mobbing as well as pairing.
  • Specification By Example – is using concrete examples to drive out a precise understanding of what the customer needs the software to do. It is usually practiced at two levels of abstraction: the system, and the internal high-level design of the code.
    • Test-Driven Development – is using tests (typically internal unit tests) to evolve the internal design of a system that satisfies an external (“customer”) test. It mandates discovery of internal design in small and very frequent feedback loops, making a few design decisions in each feedback loop. In each feedback loop, we start by writing a test that fails, which describes something we need the code to do that it currently doesn’t. Then we write the simplest solution that will pass that test. Then we review the code and make any necessary improvements – e.g. to remove some duplication, or make the code easier to understand – before moving on to the next failing test. One test at a time, we flesh out a design, discovering the internal logic and useful abstractions like methods/functions, classes/modules, interfaces and so on as we triangulate a working solution. TDD has multiple benefits that tend to make the investment in our tests worthwhile. For a start, if we only write code to pass tests, then at the end we will have all our solution code covered by fast-running tests. TDD produces high test assurance. Also, we’ve found that code that is test-driven tends to be simpler, lower in duplication and more modular. Indeed, TDD forces us to design our solutions in such a way that they are testable. Testable is synonymous with modular. Working in fast feedback loops means we tend to make fewer design decisions before getting feedback, and this tends to bring more focus to each decision. TDD, done well, promotes a form of continuous code review that few other techniques do. TDD also discourages us from writing code we don’t need, since all solution code is written to pass tests. It focuses us on the “what” instead of the “how”. Overly complex or redundant code is reduced. So, TDD tends to produce more reliable code (studies find up to 90% fewer bugs in production), that can be re-tested quickly, and that is simpler and more maintainable. It’s an effective way to achieve the frequent and sustained release cycles demanded by evolutionary development. We’ve yet to find a better way. There’s a minimal sketch of a single TDD cycle after this list.
    • Behaviour-Driven Development – is working with the customer at the system level to precisely define not what the functions and modules inside do, but what the system does as a whole. Customer tests – tests we’ve agreed with our customer that describe system behaviour using real examples (e.g., for a £250,000 mortgage paid back over 25 years at 4% interest, the monthly payments should be exactly £1,290) – drive our internal design, telling us what the units in our “unit tests” need to do in order to deliver the system behaviour the customer desires. These tests say nothing about how the required outputs are calculated, and ideally make no mention of the system design itself, leaving the developers and UX folk to figure those design details out. They are purely logical tests, precisely capturing the domain logic involved in interactions with the system. The power of BDD and customer tests (sometimes called “acceptance tests”) is how using concrete examples can help us drive out a shared understanding of what exactly a requirement like “…and then the mortgage repayments are calculated” really means. Automating these tests to pull in the example data provided by our customer forces us to be 100% clear about what the test means, since a computer cannot interpret an ambiguous statement (yet). Customer tests provide an outer “wheel” that drives the inner wheel of unit tests and TDD. We may need to write a bunch of internal units to pass an external customer test, so that outer wheel will turn more slowly. But it’s important those wheels of BDD and TDD are directly connected. We only write solution code to pass unit tests, and we only write unit tests for logic needed to pass the customer test. (There’s a sketch of a customer test for the mortgage example after this list.)
  • Code Quality – refers specifically to the properties of our code that make it easier or harder to change. As teams mature, their focus will often shift away from “making it work” to “making it easier to change, too”. This typically signals a growth in the maturity of the developers as code crafters.
    • Software Design Principles – address the underlying factors in code mechanics that can make code harder to change. On Codemanship courses, we teach two sets of design principles: Simple Design and Modular Design.
      • Simple Design
        • The code must work
        • The code must clearly reveal its intent (i.e., using module names, function names, variable names, constants and so on, to tell the story of what the code does)
        • The code must be low in duplication (unless that makes it harder to understand)
        • The code must be the simplest thing that will work
      • Modular Design (where a “module” could be a class, or component, or a service etc)
        • Modules should do one job
        • Modules should know as little about each other as possible
        • Module dependencies should be easy to swap
    • Refactoring – is the discipline of improving the internal design of our software without changing what it does. More bluntly, it’s making the code easier to change without breaking it. Like TDD, refactoring works in small feedback cycles. We perform a single refactoring – like renaming a class – and then we immediately re-run our tests to make sure we didn’t break anything. Then we do another refactoring (e.g., move that class into a different package) and test again. And then another refactoring, and test. And another, and test. And so on. As you can probably imagine, a good suite of fast-running automated tests is essential here. Refactoring and TDD work hand-in-hand: the tests make refactoring safer, and without a significant amount of refactoring, TDD becomes unsustainable. Working in these small, safe steps, a good developer can quite radically restructure the code whilst ensuring all along the way that the software still works. I was very tempted to put refactoring under Foundations, because it really is a foundational discipline for any kind of programming. But it requires a good “nose” for code quality, and it’s also an advanced skill to learn properly. So I’ve grouped it here under Code Quality. Developers need to learn to recognise code quality problems when they see them, and get hundreds of hours of practice at refactoring the code safely to eliminate them. There’s a small worked example of a refactoring step after this list.
    • Legacy Code – is code that is in active use, and therefore probably needs to be updated and improved regularly, but is too expensive and risky to change. This is usually because the code lacks fast-running automated tests. To change legacy code safely, we need to get unit tests around the parts of the code we need to change. To achieve that, we usually need to refactor that code to make it easy to unit test – i.e., to remove external dependencies from that code. This takes discipline and care. But if every change to a legacy system started with these steps, over time the unit test coverage would rise and the internal design would become more and more modular, making changes progressively easier. Most developers are afraid to work on legacy code. But with a little extra discipline, they needn’t be. I actually find it very satisfying to rehabilitate software that’s become a millstone around our customers’ necks. Most code in operation today is legacy code.
    • Continuous Inspection – is how we catch code quality problems early, when they’re easier to fix. Like anything with the word “continuous” in the title, continuous inspection implies frequent automated checking of the code for code quality “bugs” like functions that are too big or too complicated, modules with too many dependencies and so on. In traditional approaches, teams do code reviews to find these kinds of issues. For example, it’s popular these days to require a code review before a developer’s changes can be merged into the master branch of their repo. This creates bottlenecks in the delivery process, though. Code reviews performed by people looking at the code are a form of manual testing. You have to wait for someone to be available to do it, and it may take them some time to review all the changes you’ve made. More advanced teams have removed this bottleneck by automating some or all of their code reviews. It requires some investment to create an effective suite of code quality gates, but the pay-off in speeding up the check-in process usually more than pays for it. Teams doing continuous inspection tend to produce code of a significantly higher quality than teams doing manual code reviews. (The simple quality-gate script after this list gives a flavour of what an automated check can look like.)
  • Software Delivery – is all about how the code we write gets to the operational environment that requires it. We typically cover it in two stages: how does code get from the developer’s desktop into a shared repository of code that could be built, tested and released at any time? And how does that code get from the repository onto the end user’s smartphone, or the rented cloud servers, or the TV set-top box as a complete usable product?
    • Continuous Integration – is the practice of developers frequently (at least once a day) merging their changes into a shared repository from which the software can be built, tested and potentially deployed. Often seen as purely a technology issue – “we have a build server” – CI is actually a set of disciplines that the technology only enables if the team applies them. First, it implies that developers don’t go too long before merging their changes into the same branch – usually the master branch or “trunk”. Long-lived developer branches – often referred to as “feature branches” – that go unmerged for days prevent frequent merging of (and testing of merged) code, and are therefore most definitely not CI. The benefit of frequent tested merges is that we catch conflicts much earlier, and more frequent merges typically mean fewer changes in each merge, and therefore fewer merge conflicts overall. Teams working on long-lived branches often report being stuck in “merge hell” where, say, at the end of the week everyone in the team tries to merge large batches of conflicting changes. In CI, once a developer has merged their changes to the master branch, the code in the repo is built and the tests are run to ensure none of those changes has “broken the build”. It also acts as a double-check that the changes work on a different machine (the build server), which reduces the risk of configuration mistakes. Another implication of CI – if our intent is to have a repository of code that can be deployed at any time – is that the code in the master branch must always work. This means that developers need to check before they merge that the resulting merged code will work. Running a suite of good automated tests beforehand helps to ensure this. Teams who lack those tests – or who don’t run them because they take too long – tend to find that the code in their repo is permanently broken to some degree. In this case, releases will require a “stabilisation” phase to find the bugs and fix them. So the software can’t be released as soon as the customer wants.
    • Continuous Delivery – means ensuring that our software is always shippable. This encompasses a lot of disciplines. If there is code sitting on developers’ desktops or languishing in long-lived branches, we can’t ship it. If the code sitting in our repo is broken, we can’t ship it. If there’s no fast and reliable way to take the code in the repo and deploy it as a working end product to where it needs to go, we can’t ship it. As well as disciplines like TDD and CI, continuous delivery also requires a very significant investment in automating the delivery pipeline – automating builds, automating testing (and making those tests run fast enough), automating code reviews, automating deployments, and so on. And these automated delivery processes need to be fast. If your builds take 3 hours – usually because the tests take so long to run – then that will slow down those all-important customer feedback loops, and slow down the process of learning from our releases and evolving a better design. Build times in particular are like the metabolism of your development process. If development has a slow metabolism, that can lead to all sorts of other problems. You’d be surprised how often I’ve seen teams with myriad difficulties watch those issues magically evaporate after we cut their build+test time down from hours to minutes.
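
To make the TDD cycle concrete, here’s a minimal sketch of a single red-green-refactor pass in Python, using pytest-style tests. The problem and the names (total_price and the test) are hypothetical, invented purely for illustration.

    # RED: start with a failing test that describes one thing the code must do.
    def test_total_price_sums_the_line_items():
        assert total_price([10.0, 2.5, 7.5]) == 20.0


    # GREEN: write the simplest solution that passes the test.
    def total_price(line_items):
        return sum(line_items)


    # REFACTOR: with the test passing, improve the design in small, safe steps
    # (clearer names, remove duplication), re-running the test after every step.
    # Then write the next failing test (empty basket, discounts, and so on),
    # growing the design one small decision at a time.

Each pass through the loop adds one small piece of behaviour, backed by a fast-running test that stays with the code.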
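
And here’s a sketch of how a customer test might capture the mortgage example above. The expected figure is the customer-supplied example from the text (using whatever compounding and rounding convention the customer had in mind), and monthly_repayment is a hypothetical function name; the point is that a concrete, automated example forces total clarity about what “calculate the repayments” means before any solution code is written.

    import pytest

    # Customer-provided examples:
    # (amount borrowed, term in years, annual rate, expected monthly repayment)
    CUSTOMER_EXAMPLES = [
        (250_000, 25, 0.04, 1_290.00),
    ]


    @pytest.mark.parametrize("amount, years, rate, expected", CUSTOMER_EXAMPLES)
    def test_repayments_match_the_customer_examples(amount, years, rate, expected):
        assert monthly_repayment(amount, years, rate) == pytest.approx(expected, abs=0.01)


    def monthly_repayment(amount, years, rate):
        # Deliberately not implemented: in BDD this customer test starts out failing,
        # and drives out the unit tests and internal design needed to make it pass.
        raise NotImplementedError

The outer wheel (this customer test) turns slowly; the inner wheel of unit tests and TDD turns many times before it passes.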
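
Refactoring is easier to picture with a tiny example. Here’s a sketch – hypothetical code, invented for illustration – of one small step: extracting a duplicated rule into its own function, then re-running the fast tests before doing anything else.

    # Before refactoring, the VIP discount rule was duplicated in two functions:
    #
    #     def invoice_total(items, vip):
    #         total = sum(items)
    #         return total * 0.9 if vip else total
    #
    #     def quote_total(items, vip):
    #         total = sum(items)
    #         return total * 0.9 if vip else total

    # One refactoring step: extract the duplicated rule. Behaviour is unchanged,
    # and we re-run the tests immediately to prove it.
    def apply_vip_discount(total, vip):
        return total * 0.9 if vip else total


    def invoice_total(items, vip):
        return apply_vip_discount(sum(items), vip)


    def quote_total(items, vip):
        return apply_vip_discount(sum(items), vip)


    def test_vip_invoices_get_ten_percent_off():
        assert invoice_total([100.0], vip=True) == 90.0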
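
Automated code reviews don’t have to mean heavyweight tooling to start with. As a minimal illustration of a continuous-inspection quality gate, here’s a small Python script (standard library only, all names invented) that fails a build if any function in a codebase is longer than a chosen threshold – the kind of check a team might wire into their pipeline alongside linters and complexity analysers.

    """Tiny code-quality gate: flag functions longer than MAX_LINES. Illustrative only."""
    import ast
    import pathlib
    import sys

    MAX_LINES = 25  # arbitrary threshold, chosen for illustration


    def long_functions(source_dir):
        for path in pathlib.Path(source_dir).rglob("*.py"):
            tree = ast.parse(path.read_text(encoding="utf-8"))
            for node in ast.walk(tree):
                if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    length = node.end_lineno - node.lineno + 1
                    if length > MAX_LINES:
                        yield f"{path}:{node.lineno} {node.name}() is {length} lines long"


    if __name__ == "__main__":
        problems = list(long_functions(sys.argv[1] if len(sys.argv) > 1 else "."))
        print("\n".join(problems))
        sys.exit(1 if problems else 0)  # a non-zero exit code fails the build

Run as part of every build, a gate like this catches creeping complexity as soon as it appears, instead of waiting for a human reviewer to spot it.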

Now, most of this stuff is known to most developers – or, at the very least, they know of it. The final two headings caused a few scratched heads. These are more advanced topics that I’ve found teams do need to think about, but usually after they’ve mastered the core disciplines that come before.

  • Managing Code Craft
    • The Case for Code Craft – acknowledges that code craft doesn’t exist in a vacuum, and shouldn’t be seen as an end in itself. We don’t write unit tests because, for example, we’re “professionals”. We write unit tests to make changing code easier and safer. I’ve found that being clear in my own mind about why I’m doing these things helps enormously, both for myself and when persuading teams that they should try them, too. I hear it from teams all the time: “We want to do TDD, but we’re not allowed”. I’ve never had that problem, and my ability to articulate why I’m doing TDD helps.
    • Code Craft Metrics – once you’ve made your case, you’ll need to back it up with hard data. Do the disciplines of code craft really speed up feedback cycles? Do they really reduce bug counts, and does that really save time and money? Do they really reduce the cost of changing code? Do they really help us to sustain the pace of innovation for longer? I’m amazed how few teams track these things. It’s very handy data to have when the boss comes a’knockin’ with their Micro-Manager hat on, ready to tell you how to do your job.
    • Scaling Code Craft – is all about how code craft on a team and within a development organisation just doesn’t magically happen overnight. There are lots of skills and ideas and tools involved, all of which need to be learned. And these are practical skills, like riding a bicycle. You can’t just read a book and go “Hey, I’m a test-driven developer now”. Nope. You’re just someone who knows in theory what TDD is. You’ve got to do TDD to learn TDD, and lots of it. And all that takes time. Most teams who fail to adopt code craft practices do so because they grossly underestimated how much time would be required to learn them. They approach it with such low “energy” that the code craft learning curve might as well be a wall. So I help organisations structure their learning, with a combination of reading, training and mentoring to get teams on the same page, and peer-based practice and learning. To scale that up, you need to be growing your own internal mentors. Ad hoc, “a bit here when it’s needed”, “a smidgen there when we get a moment” simply doesn’t seem to work. You need to have a plan, and you need to invest. And however much you were thinking of investing, it’s not going to be enough.
  • High-Integrity Code Craft
    • Load-Bearing Code – is that portion of code, found in almost any non-trivial software, that is much more critical than the rest. That might be because it’s on an execution path for a critical feature, or because it’s a heavily reused piece of code that lies on many paths for many features. Most teams are not aware of where their load-bearing code is. Most teams don’t give it any thought. And this is where many of the horror stories attributed to bugs in software begin. Teams can improve at identifying load-bearing code – there’s a rough sketch of one way to start spotting it after this list – and at applying more exhaustive and rigorous testing techniques to achieve higher levels of assurance when needed. And before you say “Yeah, but none of our code is critical”, I’ll bet a shiny penny there’s a small percentage of your code that really, really, really needs to work. It’s there, lurking in most software, just waiting to send that embarrassing email to everyone in your address book.
    • Guided Inspection – is a powerful way of testing code by reading it. Many studies have shown that code inspections tend to find more bugs than any other kind of testing. In guided inspections, we step through our code line by line, reasoning about what it will do for a specific test case – effectively executing the code in our heads. This is, of course, labour-intensive, but we would typically only do it for load-bearing code, and only when that code itself has changed. If we discover new bugs in an inspection, we feed that back into an automated test that will catch the bug if it ever re-emerges, adding it to our suite of fast-running regression tests.
    • Design By Contract – is a technique for ensuring the correctness of the interactions between components of our system. Every interaction has a contract: a pre-condition that describes when a function or service can be used (e.g., you can only transfer money if your account has sufficient funds), and a post-condition that describes what that function or service should provide to the client (e.g., the money is deducted from your account and credited to the payee’s account). There are also invariants: things that must always be true if the software is working as required (e.g., your account never goes over its limit). Contracts are useful in two ways: for reasoning about the correct behaviour of functions and services, and for embedding expectations about that behaviour inside the code itself as assertions that will fail during testing if an expectation isn’t satisfied. We can test post-conditions using traditional unit tests, but in load-bearing code, teams have found it helpful to assert pre-conditions to ensure that not only do functions and services do what they’re supposed to, but they’re only ever called when they should be. DBC presents us with some useful conceptual tools, as well as programming techniques when we need them (there’s a small sketch of contracts as plain assertions after this list). It also paves the way to a much more exhaustive kind of automated testing, namely…
    • Property-Based Testing – sometimes referred to as generative testing, is a form of automated testing where the inputs to the tests themselves are programmatically calculated. For example, we might test that a numerical algorithm works for a range of inputs from 0…1000, at increments of 0.01, or we might test that a shipping calculation works for all combinations of inputs of country, weight class and mailing class. This is achieved by generalising the expected results in our tests, so instead of asserting that the square root of 4 is 2, we might assert that the square root of any positive number multiplied by itself is equal to the original number. These properties of correct test results look a lot like the contracts we might write when we practise Design By Contract, and therefore we might find experience in writing contracts helpful in building that kind of declarative style of asserting. The beauty of property-based tests is that they scale easily. Generating 1,000 random inputs and generating 10,000 random inputs requires a change of a single character in our test. One character, 9,000 extra test cases. Two additional characters (100,000) yield 99,000 more test cases. Property-based tests enable us to achieve quite mind-boggling levels of test assurance with relatively little extra test code, using tools most developers already know. (There’s a short sketch of the square root example as a property-based test after this list.)
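Picking up the load-bearing code point above, here’s a very rough sketch – in Python, with an invented call graph and invented feature names – of one way a team might start ranking candidates: count how many critical features have an execution path through each function. In practice you’d extract the call graph with a static analysis or profiling tool rather than write it out by hand.

```python
# A toy sketch: rank functions by how many critical features depend on them.
# The call graph and feature entry points below are made up for illustration.
from collections import Counter

CALL_GRAPH = {
    "checkout": ["calculate_total", "take_payment"],
    "refund": ["take_payment", "send_receipt"],
    "calculate_total": ["apply_discount"],
    "take_payment": ["apply_discount"],
    "apply_discount": [],
    "send_receipt": [],
}
CRITICAL_FEATURES = ["checkout", "refund"]  # entry points of critical features

def reachable(entry_point, graph):
    """Every function reachable from the feature's entry point."""
    seen, to_visit = set(), [entry_point]
    while to_visit:
        fn = to_visit.pop()
        if fn not in seen:
            seen.add(fn)
            to_visit.extend(graph.get(fn, []))
    return seen

counts = Counter()
for feature in CRITICAL_FEATURES:
    counts.update(reachable(feature, CALL_GRAPH))

# Functions near the top of this list are candidates for load-bearing code
for fn, count in counts.most_common():
    print(f"{fn}: on the execution path of {count} critical feature(s)")
```

In this toy example, take_payment and apply_discount sit on paths for both critical features, which makes them the obvious candidates for the more exhaustive techniques above.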
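And here’s a little sketch of what Design By Contract can look like when all you have is plain assertions. The account example mirrors the one above and is purely illustrative; languages with first-class contract support (Eiffel being the classic example) make this less manual.

```python
# A toy sketch of Design By Contract using plain Python assertions.
# The Account class and its rules are invented for illustration.
class Account:
    def __init__(self, balance, overdraft_limit=0):
        self.balance = balance
        self.overdraft_limit = overdraft_limit

    def check_invariant(self):
        # Invariant: the account never goes over its overdraft limit
        assert self.balance >= -self.overdraft_limit, "account over its limit"


def transfer(payer, payee, amount):
    # Pre-condition: we can only transfer if the payer has sufficient funds
    assert amount > 0, "transfer amount must be positive"
    assert payer.balance + payer.overdraft_limit >= amount, "insufficient funds"

    payer_before, payee_before = payer.balance, payee.balance
    payer.balance -= amount
    payee.balance += amount

    # Post-condition: money is deducted from the payer and credited to the payee
    assert payer.balance == payer_before - amount
    assert payee.balance == payee_before + amount
    payer.check_invariant()
    payee.check_invariant()
```

During testing, these assertions fail the moment an expectation isn’t satisfied – which is exactly the early warning we want in load-bearing code.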
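Finally, here’s what the square root example above might look like as a property-based test. I’ve used Python’s hypothesis library here – just one of many property-based testing tools – and the numbers are illustrative.

```python
# A sketch of a property-based test using the "hypothesis" library.
# Rather than asserting that sqrt(4) == 2, we assert a property that must
# hold for any non-negative input the tool generates.
from math import sqrt, isclose
from hypothesis import given, settings, strategies as st

@settings(max_examples=1000)  # add a zero or two for 10,000 or 100,000 cases
@given(st.floats(min_value=0, max_value=1e6))
def test_square_root_times_itself_gives_back_the_original(x):
    root = sqrt(x)
    # The "property": the root squared should equal the original number
    # (within floating-point tolerance)
    assert isclose(root * root, x, rel_tol=1e-9, abs_tol=1e-9)
```

Run it with a test runner like pytest and the tool generates the thousand inputs for you; the one-character change mentioned above is the extra zero in max_examples.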

So there you have it: my code craft road map, in a nutshell. Many of these disciplines are covered in introductory – but practical – detail in the Codemanship TDD course book.

If your team could use a hands-on introduction to code craft, our 3-day TDD course can give them a head start.

The Return of Micro-Methods

Many moons ago, I experimented with what I call micro-methods for software development teams. A micro-method is a thought experiment that constrains teams with just three rules. Outside of those rules, teams can do whatever they want.

Here’s an example of a micro-method called “Hard Agile”:

  1. If any changes in the repo go unreleased for more than 7 days, the repo is reverted to the last release with no back-up
  2. All local copies of code and all non-trunk branches are deleted daily with no back-ups
  3. All check-ins are mutation tested. If the mutation coverage is less than 100% (or as near as margins for error might allow, depending on the tool used), the check-in is rejected

The purpose of the thought experiment is to imagine what kind of behaviour might emerge from the team that strictly follows these three rules, but no others.

In the case of Hard Agile, teams can’t go for more than a week without releasing, or they lose all their unreleased changes. Developers can’t go for more than a day without merging to master, or they lose all their un-merged changes. And developers can’t check in code that can be broken without any automated tests catching it.
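If mutation testing is new to you, here’s a toy illustration in Python of what that third rule is checking for. Real tools (PIT for Java, Stryker for JavaScript and .NET, mutmut for Python, among others) generate and run thousands of mutants automatically; the sketch below exists only to show the principle.

```python
# A toy illustration of mutation testing: change one small thing in the code
# under test (a "mutant") and check that at least one test now fails.
def total_price(unit_price, quantity):
    return unit_price * quantity      # original code

def total_price_mutant(unit_price, quantity):
    return unit_price + quantity      # mutant: '*' flipped to '+'

def test_total_price(fn):
    # A test that meaningfully pins down the behaviour...
    assert fn(5, 3) == 15
    # ...note that a weaker test like fn(2, 2) == 4 would NOT catch this mutant

test_total_price(total_price)         # the real code passes

try:
    test_total_price(total_price_mutant)
    print("Mutant survived - under rule 3 this check-in would be rejected.")
except AssertionError:
    print("Mutant killed - the tests meaningfully exercise this code.")
```

Mutation coverage is then the proportion of generated mutants that the tests kill; 100% means every mutant the tool tried was caught by at least one failing test.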

There’s nothing in here about requirements, or about code reviews, or about planning, or any of that stuff. The team would be left to figure that out for themselves.

But they would have no choice but to release the software at least once a week, so they’d be getting feedback from real end users frequently and – if they want the product to succeed – they’d need to act on that feedback.

They’d have no choice but to merge to master at least once a day, so more frequent releases would be practical and merge conflicts would be spotted earlier.

And they’d have no choice but to ensure all of their code is meaningfully tested with automated tests, so their releases would be more reliable and the code easier to change without breaking the software. They’d also probably need to follow design idioms that favour testability, and to keep the software simpler because of the “test automation tax” they’d have to pay for every line of solution code they want to add – so there’d be a likely side-benefit in the design simplicity and modularity that studies have shown tend to result. We’ve tended to find that the optimum way to ensure all our code is tested is to only write solution code to pass tests. Sound familiar?

And, of course, the theoretical rules can be fitted with sliders. Maybe we can revert the repo if changes go unreleased for more than 3 days, or even one day? Maybe we can delete local copies and branches twice a day, or every hour?

Such a set-up, I reckon, might be useful for training teams – starting with nice, easy limits and gradually shrinking them to help ingrain these cadences until they become second nature.

Once a team’s “inner egg timer” is developed enough, we could remove the rules and see what behaviour emerges.