One of the topics we cover on the Codemanship TDD course is one that developers raise often: how can we write fast-running unit tests for code that’s not easily unit-testable? Most developers are working on legacy code – code that’s difficult and risky to change – most of the time. So it’s odd there’s only one book about it.
I highly recommend Micheal Feather’s book for any developer working in any technology applying any approach to development. On the TDD course, I summarise what we mean by “legacy code” – code that doesn’t have fast-running automated tests, making it risky to change – and briefly demonstrate Michael’s process for changing legacy code safely.
The example I use is a simple Python program for pricing movie rentals based on their IMDB ratings. Average movies rentals cost £3.95. High-rated movies cost an extra pound, low-rated movies cost a pound less.
My program has no automated tests, so I’ve been testing it manually using the command line.
Suppose the business asked us to change the pricing logic; how could we do this safely if we lack automated tests to guard against breaking the code?
Michael’s process goes like this:
- Identify what code you will need to change
- Identify where around that code you’d want unit tests
- Break any dependencies that are stopping you from writing unit tests
- Write the unit tests you’d want to satisfy you the changes you’ll make didn’t break the code
- Make the change
- While you’re there, refactor to improve the code that’s now covered by unit tests to make life easier for the next person who changes it (which could be you)
My Python program has a class called Pricer which we’ll need to change to update the pricing logic.
|
class Pricer(object): |
|
|
|
def price(self, imdbID): |
|
|
|
response = requests.get("http://www.omdbapi.com/?i=" + imdbID + "&apikey=6487ec62") |
|
|
|
json = response.json() |
|
|
|
title = json["Title"] |
|
rating = float(json["imdbRating"]) |
|
|
|
price = 3.95 |
|
|
|
if rating > 7: |
|
price += 1.0 |
|
|
|
if rating < 4: |
|
price -= 1.0 |
|
|
|
return Video(imdbID, title, rating, price) |
I’ve been testing this logic one level above by testing the Rental class that uses Pricer.
|
class Rental(object): |
|
|
|
def __init__(self, customer, imdbID): |
|
self.customer = customer |
|
self.video = Pricer().price(imdbID) |
|
|
|
def __str__(self): |
|
return "Video Rental – customer: " + self.customer \ |
|
+ ". Video => title: " + self.video.title \ |
|
+ ", price: £" + str(self.video.price) |
My script that I’ve been manually testing with allows me to create Rental objects and write their data to the command line for different movies using their IMDB ID’s.
|
def main(): |
|
customer = sys.argv[1] |
|
imdbID = sys.argv[2] |
|
|
|
rental = Rental(customer, imdbID) |
|
|
|
print(rental) |
|
|
|
if __name__ == "__main__": main() |
|
|
|
# |
|
# Example movies: |
|
# |
|
# tt0096754 – high rated |
|
# tt0060666 – low rated |
|
# tt0317303 – medium rated |
I use three example movies – one with a high rating, one low-rated and one medium-rated – to test the code. For example, the output for the high-rated movie looks like this.
C:\Users\User\Desktop\tdd 2.0\python_legacy>python program.py jgorman tt0096754
Video Rental – customer: jgorman. Video => title: The Abyss, price: £4.95
I’d like to reproduce these manual tests as unit tests, so I’ll be writing unittest tests for the Rental class for each kind of movie.
But before I can do that, there’s an external dependency we have to deal with. The Pricer class connects directly to the OMDB API that provides movie information. I want to stub that so I can provide test IMDB ratings without connecting.
Here’s where we have to get disciplined. I want to refactor the code to make it unit-testable, but it’s risky to do that because… there’s no unit tests! Opinions differ on approach, but personally – learned through bitter experience – I’ve found that it’s still important to re-test the code after every refactoring, manually if need be. It will seem like a drag, but we all tend to overlook how much time we waste downstream fixing avoidable bugs. It will seem slower to manually re-test, but it’s often actually faster in the final reckoning.
Okay, let’s do a refactoring. First, let’s get that external dependency in its own method.
|
class Pricer(object): |
|
|
|
def price(self, imdbID): |
|
|
|
rating, title = self.fetch_video_info(imdbID) |
|
|
|
price = 3.95 |
|
|
|
if rating > 7: |
|
price += 1.0 |
|
|
|
if rating < 4: |
|
price -= 1.0 |
|
|
|
return Video(imdbID, title, rating, price) |
|
|
|
def fetch_video_info(self, imdbID): |
|
response = requests.get("http://www.omdbapi.com/?i=" + imdbID + "&apikey=6487ec62") |
|
json = response.json() |
|
title = json["Title"] |
|
rating = float(json["imdbRating"]) |
|
return rating, title |
I re-run my manual tests. Still passing. So far, so good.
Next, let’s move that new method into its own class.
|
class Pricer(object): |
|
|
|
def price(self, imdbID): |
|
|
|
rating, title = VideoInfo().fetch_video_info(imdbID) |
|
|
|
price = 3.95 |
|
|
|
if rating > 7: |
|
price += 1.0 |
|
|
|
if rating < 4: |
|
price -= 1.0 |
|
|
|
return Video(imdbID, title, rating, price) |
|
|
|
|
|
class VideoInfo: |
|
def fetch_video_info(self, imdbID): |
|
response = requests.get("http://www.omdbapi.com/?i=" + imdbID + "&apikey=6487ec62") |
|
json = response.json() |
|
title = json["Title"] |
|
rating = float(json["imdbRating"]) |
|
return rating, title |
And re-test. All passing.
To make the dependency on VideoInfo swappable, the instance needs to be injected into the constructor of Pricer from Rental.
|
class Pricer(object): |
|
def __init__(self, videoInfo): |
|
self.videoInfo = videoInfo |
|
|
|
def price(self, imdbID): |
|
|
|
rating, title = self.videoInfo.fetch_video_info(imdbID) |
|
|
|
price = 3.95 |
|
|
|
if rating > 7: |
|
price += 1.0 |
|
|
|
if rating < 4: |
|
price -= 1.0 |
|
|
|
return Video(imdbID, title, rating, price) |
|
class Rental(object): |
|
|
|
def __init__(self, customer, imdbID): |
|
self.customer = customer |
|
self.video = Pricer(VideoInfo()).price(imdbID) |
|
|
|
def __str__(self): |
|
return "Video Rental – customer: " + self.customer \ |
|
+ ". Video => title: " + self.video.title \ |
|
+ ", price: £" + str(self.video.price) |
And re-test. All passing.
Next, we need to inject the Pricer into Rental, so we can stub VideoInfo in our planned unit tests.
|
class Rental(object): |
|
|
|
def __init__(self, customer, imdbID, pricer): |
|
self.customer = customer |
|
self.video = pricer.price(imdbID) |
|
|
|
def __str__(self): |
|
return "Video Rental – customer: " + self.customer \ |
|
+ ". Video => title: " + self.video.title \ |
|
+ ", price: £" + str(self.video.price) |
And re-test. All passing.
Now we can write unit tests to replicate our command line tests.
|
class VideoInfoStub(VideoInfo): |
|
def __init__(self, rating): |
|
self.rating = rating |
|
self.title = 'XXXXX' |
|
|
|
def fetch_video_info(self, imdbID): |
|
return self.rating, self.title |
|
|
|
|
|
class RentalTest(unittest.TestCase): |
|
def test_movie_title_retrieved(self): |
|
rental = Rental('jgorman', 'tt9999', Pricer(VideoInfoStub(7.1))) |
|
self.assertEqual('XXXXX', rental.video.title) |
|
|
|
def test_customer_id_recorded(self): |
|
rental = Rental('jgorman', 'tt9999', Pricer(VideoInfoStub(7.1))) |
|
self.assertEqual('jgorman', rental.customer) |
|
|
|
def test_high_rated_movie_price(self): |
|
rental = Rental('jgorman', 'tt9999', Pricer(VideoInfoStub(7.1))) |
|
self.assertEqual(4.95, rental.video.price) |
|
|
|
def test_low_rated_movie_price(self): |
|
rental = Rental('jgorman', 'tt9999', Pricer(VideoInfoStub(3.9))) |
|
self.assertEqual(2.95, rental.video.price) |
|
|
|
def test_medium_rated_movie_price(self): |
|
rental = Rental('jgorman', 'tt9999', Pricer(VideoInfoStub(6.9))) |
|
self.assertEqual(3.95, rental.video.price) |
These unit tests reproduce all the checks I was doing visually at the command line, but they run in a fraction of a second. The going get’s much easier from here.
Now I can make the change to the pricing logic the business requested.

We can tackle this in a test-driven way now. Let’s update the relevant unit test so that it now fails.
|
def test_high_rated_movie_price(self): |
|
rental = Rental('jgorman', 'tt9999', Pricer(VideoInfoStub(7.1))) |
|
self.assertEqual(5.95, rental.video.price) |
Now let’s make it pass.
|
def price(self, imdbID): |
|
|
|
rating, title = self.videoInfo.fetch_video_info(imdbID) |
|
|
|
price = 3.95 |
|
|
|
if rating > 7: |
|
price += 2.0 |
|
|
|
if rating < 4: |
|
price -= 1.0 |
|
|
|
return Video(imdbID, title, rating, price) |
(And, yes – obviously in a real product, the change would likely be more complex than this.)
Okay, so we’ve made the change, and we can be confident we haven’t broken the software. We’ve also added some test coverage and dealt with a problematic dependency in our architecture. If we wanted to get movie ratings from somewhere else (e.g., Rotten Tomatoes), or even aggregate sources, it would be quite straightforward now that we’ve cleanly separated that concern from our business logic.
One last thing while we’re here: there’s a couple of things in this code that have been bugging me. Firstly, we’ve been mixing our terminology: the customer says “movie”, but our code says “video”. Let’s make our code speak the customer’s language.
|
class Pricer(object): |
|
def __init__(self, movieInfo): |
|
self.movieInfo = movieInfo |
|
|
|
def price(self, imdbID): |
|
|
|
rating, title = self.movieInfo.fetch_movie_info(imdbID) |
|
|
|
price = 3.95 |
|
|
|
if rating > 7: |
|
price += 2.0 |
|
|
|
if rating < 4: |
|
price -= 1.0 |
|
|
|
return Movie(imdbID, title, rating, price) |
Secondly, I’m not happy with clients accessing objects’ fields directly. Let’s encapsulate.
|
class Rental(object): |
|
|
|
def __init__(self, customer, imdbID, pricer): |
|
self._customer = customer |
|
self._movie = pricer.price(imdbID) |
|
|
|
def __str__(self): |
|
return "Movie Rental – customer: " + self._customer \ |
|
+ ". Movie => title: " + self._movie.get_title() \ |
|
+ ", price: £" + str(self._movie.get_price()) |
|
|
|
def get_customer(self): |
|
return self._customer |
|
|
|
def get_movie_price(self): |
|
return self._movie.get_price() |
|
|
|
def get_movie_title(self): |
|
return self._movie.get_title() |
With our added unit tests, these extra refactorings were much easier to do, and hopefully that means that changing this code in the future will be much easier, too.
Over time, one change at a time, the unit test coverage will build up and the code will get easier to change. Applying this process over weeks, months and years, I’ve seen some horrifically rigid and brittle software products – so expensive and risky to change that the business had stopped asking – be rehabilitated and become going concerns again.
By focusing our efforts on changes our customer wants, we’re less likely to run into a situation where writing unit tests and refactoring gets vetoed by our managers. The results of highly visible “refactoring sprints”, or even long refactoring phases – I’ve known clients freeze requirements for up to a year to “refactor” legacy code – are typically disappointing, and run the risk of making refactoring and adding unit tests forbidden by disgruntled bosses.
One final piece of advice: never, ever discuss this process with non-technical stakeholders. If you’re asked to break down an estimate to change legacy code, resist. My experience has been that it often doesn’t take any longer to make the change safely, and the longer-term benefits are obvious. Don’t give your manager or your customer the opportunity to shoot themselves in the foot by offering up unit tests and refactoring as a line item. Chances are, they’ll say “no, thanks”. And that’s in nobody’s interests.