Modular Systems Wear Their Dependencies On The Outside

Just a quick brain dump about dependency inversion, dependency injection and the “Russian dolls” style of object composition that a lot of developers ask me about.
Consider this piece of legacy code for a video rental application:

The pricing logic of our app uses IMDB ratings that it fetches from a web API. If we wanted to unit test this logic, or use a different source of video ratings, we’re stuck because the code to fetch the rating is embedded with the pricing logic.

We can extract the fetching code into its own method in its own class and inject an instance of it in the constructor of Pricer.

It’s now possible to substitute a different implementation of ImdbAPI (e.g., a stub, for unit testing) from outside Pricer.
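The original listing isn't reproduced here, so what follows is only a rough Python sketch of that extraction. The `fetch_rating` method name, the `StubRatings` class, the placeholder ID and the pricing rule are all assumptions made for illustration, not the code from the original app.

```python
class Pricer:
    """Pricing logic that is handed its ratings source instead of creating it."""

    def __init__(self, ratings):
        # `ratings` is any object with a fetch_rating(imdb_id) method.
        self._ratings = ratings

    def price(self, imdb_id):
        rating = self._ratings.fetch_rating(imdb_id)
        # Hypothetical pricing rule: highly rated titles cost a pound more.
        return 4.95 if rating > 7.0 else 3.95


class StubRatings:
    """Test double standing in for the real ImdbAPI class."""

    def __init__(self, rating):
        self._rating = rating

    def fetch_rating(self, imdb_id):
        # Ignores the ID and returns the canned rating.
        return self._rating


pricer = Pricer(StubRatings(8.1))
print(pricer.price("tt0000001"))  # placeholder ID, never looked up
```

Because Pricer only calls `fetch_rating`, a unit test can drive both branches of the pricing rule without touching the network.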
But if we look one level up in our call stack, we see we still have a problem.

Our Rental class knows about Pricer and knows about ImdbAPI, so Rental is not unit-testable. We can fix this by injecting Pricer into Rental.

This is the “Russian dolls” style of composition I mentioned at the beginning. ImdbAPI is injected into Pricer, and Pricer is injected into Rental. So we can swap the pricing logic without changing Rental, and we can swap the ratings source without changing Pricer, and every object in the chain only knows about the next object in the chain – and only its interface (method signatures). Notice how the object at the bottom of the call stack is created first.
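As a sketch of how that composition might look in Python: `FixedRatings` stands in for ImdbAPI so the example runs offline, and the `charge` method and pricing rule are hypothetical.

```python
class FixedRatings:
    """Stand-in for ImdbAPI so the sketch runs without network access."""

    def fetch_rating(self, imdb_id):
        return 8.1


class Pricer:
    def __init__(self, ratings):
        self._ratings = ratings

    def price(self, imdb_id):
        # Hypothetical rule, for illustration only.
        return 4.95 if self._ratings.fetch_rating(imdb_id) > 7.0 else 3.95


class Rental:
    def __init__(self, pricer):
        # Rental knows only that a pricer has a price(imdb_id) method.
        self._pricer = pricer

    def charge(self, customer, imdb_id):
        return f"{customer} owes {self._pricer.price(imdb_id):.2f}"


# The composition root: innermost doll first, outermost last.
ratings = FixedRatings()
pricer = Pricer(ratings)
rental = Rental(pricer)
print(rental.charge("Jane", "tt0000001"))
```

Only the last four lines know which concrete classes are in play; Rental and Pricer never name them.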

Notice now that our highest-level module, Program.py, is “wearing” those dependencies. Program knows which implementations for pricing and fetching movie ratings are being used. Our core internal logic knows none of these details.
This is a natural consequence of dependency inversion and the “Russian dolls” style of composition – effective modular systems wear their dependencies on the outside.
The eagle-eyed among you will have noticed that Pricer has one concrete dependency, on Video.

But you should also notice that, while Pricer creates a Video, it doesn’t use the Video. This is very important: objects that reference other objects’ implementations shouldn’t call their methods, and objects that call other objects’ methods shouldn’t reference their implementations.

That’s a more essential form of dependency inversion. Any class that uses a Video’s methods shouldn’t bind directly to an implementation. So we could stub Pricer to return any object that looks like a Video from the outside. This is only possible if Pricer is swappable.
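A small sketch of that split, assuming (hypothetically) that the pricer exposes a `video_for` method and that a video has `title` and `price` attributes:

```python
from types import SimpleNamespace


class Rental:
    """Uses a video's interface but never names a concrete Video class."""

    def __init__(self, pricer):
        self._pricer = pricer

    def summary(self, imdb_id):
        video = self._pricer.video_for(imdb_id)  # whatever the pricer hands back
        return f"{video.title}: {video.price:.2f}"


class StubPricer:
    """Swapped in for Pricer; returns something that merely looks like a Video."""

    def video_for(self, imdb_id):
        return SimpleNamespace(title="Stub Movie", price=3.95)


print(Rental(StubPricer()).summary("tt0000001"))
```

The creator of the object (the pricer) never calls its methods, and the caller of its methods (Rental) never references its class.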

Classes Start With Functions, Not Data

A common mistake developers make when designing classes is to start with a data model in mind and then try to attach functions to that data (e.g., a Zoo has a Keeper, who has a first name and a last name, etc). This data-centred view of classes tends to lead us towards anaemic models, where classes are nothing more than data containers and the logic that uses the data is distributed throughout the system. This lack of encapsulation creates huge amounts of low-level coupling.

Try instead to start with the function you need, and see what data it requires. This can be illustrated with a bit of TDD. In this example, we want to buy a CD. I start by writing the buy function, without any class to hang that on.

The parameters for buy() tell us what data this function needs. If we want to encapsulate some of that data, so that clients don’t need to know about all of them, we can introduce a parameter object to group related params.
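A minimal Python sketch of those two steps, assuming that buying a CD means checking stock and deducting the cost from a card balance. The parameter names and the `CompactDisc` fields are hypothetical.

```python
from dataclasses import dataclass


def buy_with_loose_params(stock, price, quantity, card_balance):
    """First cut: every piece of data the function needs is a parameter.

    Returns the new (stock, card_balance) pair.
    """
    if quantity > stock:
        raise ValueError("insufficient stock")
    return stock - quantity, card_balance - price * quantity


@dataclass
class CompactDisc:
    """Parameter object grouping the data that describes the CD."""
    stock: int
    price: int


def buy(cd, quantity, card_balance):
    """Same behaviour, with the CD data grouped behind one parameter."""
    if quantity > cd.stock:
        raise ValueError("insufficient stock")
    cd.stock -= quantity
    return card_balance - cd.price * quantity
```

The fields on `CompactDisc` exist only because `buy()` asked for them.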

This has greatly simplified the signature of the buy() function, and we can easily move buy() to the cd parameter.

Inside the new CompactDisc class…

We have a bunch of getters we don’t need any more. Let’s inline them.
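Roughly where that might end up, sketched in Python on the assumption that a CD holds a stock count and a price (both hypothetical fields):

```python
class CompactDisc:
    """buy() lives with the data it needs, so no getters are required."""

    def __init__(self, stock, price):
        self._stock = stock  # private by convention
        self._price = price

    def buy(self, quantity, card_balance):
        """Reduce stock and return the card balance after payment."""
        if quantity > self._stock:
            raise ValueError("insufficient stock")
        self._stock -= quantity
        return card_balance - self._price * quantity


cd = CompactDisc(stock=5, price=10)
print(cd.buy(2, card_balance=100))
```

Clients tell the CD what to do and never see its stock or price.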

Now, you may argue that you would have come up with this data model for a CD anyway. Maybe. But the point is that the data model is specifically there to support buying a CD.

When we start with the data, there’s a greater risk of ending up with the wrong data (e.g., many devs who try this exercise start by asking “What can we know about a CD?” and give it fields the functions don’t use), or with the right data in the wrong place – which is where we end up with Feature Envy and message chains and other coupling code smells galore.

Timeless Design Principles

In spare moments over the last couple of weeks, I’ve been preparing for a keynote I’ll be giving in May. My talk is titled Timeless Design Principles, and the thrust of it is to demonstrate how the software design principles I teach on my Codemanship courses could have been applied going back through the decades with the technology of the day.

This is partly a response to those developers working in shiny new languages who claim that these principles don’t apply to them, because “we work in JavaScript” or “we do functional programming”. I’ve been here before: in the 90s, when OOP was becoming mainstream, I would hear developers say “we don’t need to worry about modularity, because OOP is inherently modular.” LOLZ.

The fact of the matter is that no programming language forces you to write readable, simple, modular code that’s low in duplication. Sure, some make it easier than others. But not so much easier that we can just take our foot off the design-thinking gas and coast.

Let’s travel back through the decades to illustrate what I mean…

  • 2009 – Ten years ago, Ruby was very much the language du jour. The fashion was very much to build on the Ruby On Rails framework, which is very database-centric. A lot of Ruby code I saw back then had a distinctly 2-tier “transaction script” flavour to it. Not much going on in the way of true data encapsulation, with everything tightly bound to the database schema or the UI. This wasn’t the language’s fault, of course. As a dynamically-typed OO language, there’s nothing stopping us writing effectively modular code. There’s nothing not stopping us, either. Does Ruby S.O.L.I.D.? Sure it does.
  • 1999 – The reigning monarch of late 90’s programming languages – measured by how many people built software with it – was Visual Basic. In 1999, we were on to Visual Basic 6, which had some basic elements of OO. Again, the fashion was very much for database-centric GUI applications (and web applications) that had little in the way of effective modularity. To be fair, Microsoft’s own architecture advice kind of pushed developers in this direction, with “business logic” layers that marshalled database recordsets to the user interface and little else. But, again, you don’t have to do things this way. The secret modular source in VB6 was Microsoft’s Component Object Model (COM). Through the use of COM interfaces, we can hide data, make dependencies easily swappable and present client-specific interfaces.
  • 1989 – In the late 80s, C ruled the world. And, again, the fashion for programming/design style wasn’t very modular. But, also again, there’s no reason why you can’t write modular code – simple, single-purpose, interchangeable modules that hide their internal data/workings – in C. It has everything a language needs to S.O.L.I.D. Indeed, if you know how, C can in fact be as OO as most OO languages. Or as FP as most FP languages (especially GNU C).
  • 1979 – At the end of the 70s, Fortran was the most widely-used programming language in scientific and engineering computing. At university, I used Fortran 77. I was interested to see if it could S.O.L.I.D., and – you know what? – it kind of does. The dialect of F77 I used in this example differed from the language I remembered from university, though. Specifically, the F77 I wrote on Sun Sparc workstations supported data structures, which would have helped enormously with encapsulating data, in a C style. But, crucially, F77 had a basic form of dynamic dispatch, achieved by passing function references into functions and procedures, so a degree of swappability was possible.
  • 1969 – Going this far back, the choice of languages gets much smaller. And I’ll be the first to admit that there weren’t many choices that had the features I needed. But there were some. I chose the highly influential Simula 67 – the first recognisably object oriented programming language – to illustrate. The rather bizarre thing about this little demonstration is how, in many ways, Simula 67 was more advanced than programming languages that came decades later, like Visual Basic.
  • 1959 – Now we’re getting closer to the dawn of computer programming. The first 3GL, Fortran, only appeared in 1957. But was there really a programming language we could have applied the principles of Simple Design, Tell, Don’t Ask and S.O.L.I.D. in 60 years ago? One word: LISP.

And what of the present day? In 2019, the vast majority of programming languages in regular use have the features necessary to apply these fundamental design principles: Java, Python, C++, C#, F#, JavaScript, Clojure, Scala, Go, Swift, PHP, and most of the rest, to a large degree.

What’s in vogue right now is functional programming. I hear devotees of FP say on a regular basis “Oh, FP is inherently S.O.L.I.D.” Oh, really?

Even in the purest of pure functional languages, the compiler won’t force you to write functions and modules that do only one job. It won’t force you to hide the data (indeed, some encourage us not to with data classes!) And it won’t force you to inject function dependencies to make them easily swappable. And it certainly won’t force you to write code that works, or write code that’s easy to understand, or write code that’s low in duplication and made from the simplest parts. We still have to think about all of these things in 2019. Just as we did in 2009, 1999, 1989, 1979, 1969 and 1959. And I have very little doubt we’ll still need to think about them in 2029, 2039 and beyond.

S.O.L.I.D. Visual Basic?

In my journey back through the decades, investigating how the software design principles I teach on Codemanship courses could have been applied in programming languages of the day, I’ve visited 2009 (Ruby), 1989 (C), 1979 (Fortran 77) and 1969 (Simula 67), as well as a shiny new language from the present day (Kotlin) to bring us up to date.

For 1999, I’ve thought long and hard about what language to choose, and eventually settled on arguably the most popular at the time: Visual Basic. In that year, VB was in its last incarnation before the introduction of .NET. This was the height of the Microsoft COM era. Visual Basic 6 had some elements of object orientation, which were – in reality – built on COM. That is, VB6 “classes” were modules that had private implementations hidden behind public COM interfaces (a class with no fields at all still took up 96 bytes because… COM).

Here’s the carpet quote example done using VB6 class modules.

Room and Carpet are simple data classes, leading to the inevitable Feature Envy in CarpetQuote.

Also, the Quote() function does two distinct jobs: calculating the area of carpet required for a room and calculating the price of that fitted carpet. It knows too much.

Let’s break up the work…

Now let’s move those functions to where they belong.
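The VB6 listings aren't reproduced here, so this is only a Python sketch of the shape of that refactoring. The field names (width, length, price per square metre, a round-up flag) are assumptions, and unlike VB6, Python has constructors, so the sketch sidesteps the property-procedure problem.

```python
import math


class Room:
    def __init__(self, width, length):
        self._width = width
        self._length = length

    def area(self):
        # The area calculation now lives with the dimensions.
        return self._width * self._length


class Carpet:
    def __init__(self, price_per_sq_m, round_up):
        self._price_per_sq_m = price_per_sq_m
        self._round_up = round_up

    def price(self, area):
        # The pricing calculation now lives with the pricing data.
        if self._round_up:
            area = math.ceil(area)
        return area * self._price_per_sq_m


class CarpetQuote:
    def quote(self, room, carpet):
        # Tell, Don't Ask: no reaching into Room or Carpet fields.
        return carpet.price(room.area())


print(CarpetQuote().quote(Room(2.5, 2.5), Carpet(10.0, True)))
```

`quote()` is now a one-line composed method that knows nothing about how either calculation is done.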

Plus one for encapsulation, right? Well, not quite. Let’s take a look inside Room and Carpet.

I’d like to hide the data, and get rid of these property procedures. In a language like Java, I could pass the data in to a constructor and keep them private. But VB6 doesn’t support constructors, because COM components don’t support constructors. So I have to instantiate each class and then set their data from outside.

Is there a way to approximate a constructor in VB6 and keep the data hidden? Well, not really. In C, we could use a function in the same module as the data is defined to initialise a new Room or Carpet. VB6 doesn’t support static methods on classes, so a function to create an object could only be defined in a separate module, so the ability to set field values would have to be exposed. We could get rid of our getters, but not our setters.

Then our client can just use the CreateRoom() function to instantiate Room. It’s a little better, but – as with many languages – encapsulation can only really be achieved through discipline in writing client code, not actual language features. (This is just as true in, say, Python as in VB6.)

Now, how about swappability? Does VB6 support easy polymorphism? Remember that, in VB6, “classes” are really modules hidden behind COM interfaces. For sure, we can have multiple classes that implement the same COM interface. So, if we wanted to have two different ways of calculating the area of a room, we could define a Room interface (as a .cls file)

…and then have different implementations of Room that know about the details.

CarpetQuote works with either implementation of room, and only sees the Room COM interface with the Area() method. This is how we hide details/data in Visual Basic 6 – using COM interfaces.
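By loose analogy only, here is the same arrangement in Python, with a `typing.Protocol` playing the part of the COM interface. The two implementations (rectangular and circular) and the simplified `quote` signature are assumptions for illustration.

```python
import math
from typing import Protocol


class Room(Protocol):
    """The only thing a client is allowed to know about a room."""

    def area(self) -> float: ...


class RectangularRoom:
    def __init__(self, width: float, length: float) -> None:
        self._width = width
        self._length = length

    def area(self) -> float:
        return self._width * self._length


class CircularRoom:
    def __init__(self, radius: float) -> None:
        self._radius = radius

    def area(self) -> float:
        return math.pi * self._radius ** 2


def quote(room: Room, price_per_sq_m: float) -> float:
    # Binds to the Room interface, not to either implementation.
    return room.area() * price_per_sq_m


print(quote(RectangularRoom(2.0, 3.0), 10.0))
```

Either implementation can be passed to `quote()` without changing it.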

So that’s a tick for swappability, and a kind of tick – after a fashion – to encapsulation. Finally, can we make a VB6 class present client-specific interfaces?

Imagine we have a client that just needs to know how many flights of stairs carpet fitters will have to climb/descend to reach a room, so we can calculate a premium. Can we add another interface to an implementation of Room?

A VB6 class can implement more than one COM interface.

Our client binds only to the FloorLevel interface.
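A Python approximation of that idea, again using protocols in place of COM interfaces. The `flights_of_stairs` method, the `UpstairsRoom` class and the premium calculation are all hypothetical names for this sketch.

```python
from typing import Protocol


class Room(Protocol):
    def area(self) -> float: ...


class FloorLevel(Protocol):
    def flights_of_stairs(self) -> int: ...


class UpstairsRoom:
    """One implementation that satisfies both interfaces."""

    def __init__(self, width: float, length: float, floor: int) -> None:
        self._width = width
        self._length = length
        self._floor = floor  # 0 = ground floor, negative = basement

    def area(self) -> float:
        return self._width * self._length

    def flights_of_stairs(self) -> int:
        return abs(self._floor)


def stairs_premium(location: FloorLevel, per_flight: float) -> float:
    # This client sees only FloorLevel; area() is invisible to it.
    return location.flights_of_stairs() * per_flight


print(stairs_premium(UpstairsRoom(2.0, 3.0, 2), 5.0))
```

Each client depends only on the methods it actually uses.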

So, through the use of COM interfaces, that’s a tick for Interface Segregation.

Which means that – perhaps surprisingly – VB6 is 100% S.O.L.I.D.  Who’d have thunk?

 

You can view the complete source code at https://github.com/jasongorman/SOLID-VB6

Fortran 77 – Does It S.O.L.I.D.?

My journey through time to see how the software design principles I teach on Codemanship courses could have been applied in the past continues to 1979, 40 years ago. Although it was already 22 years old by then, Fortran was still one of the most popular languages – particularly in scientific and engineering computing.

The language had an upgrade in 1977, but was still recognisably the very procedural language that was first envisaged in 1957. I used Fortran 77 in my degree studies, and I last used it 27 years ago. How many of the design principles I recommend to developers could I have been applying in it?

Let’s start with the 4 principles of Simple Design:

  1. Should Fortran 77 code work? Well, I think so. Don’t you? In the early 1990s when I was writing code to do computational maths, I was a pretty naive programmer. I wrote no automated tests, and typically wrote all the code for a program – usually just 100-200 lines of it – before trying it to see if it gave the answers I expected. Can you automatically test Fortran 77 code? Of course you can. If you can write code that calls code, you can write automated tests.
  2. Should Fortran 77 code clearly communicate its intent? Again, does anyone believe that readability doesn’t matter in Fortran 77? The language has all the mechanisms we need to endow our code with meaning – i.e., opportunities to name things.
  3. Should Fortran 77 code be free of duplication (unless that makes it harder to understand)? The question is really one of language design: does Fortran 77 allow us to reuse code instead of repeating it? Yes, it does. Functions and subprocedures can encapsulate repeated code. And – in most implementations – code can be split into separate reusable “modules”. In the GNU77 version I used for this example, multiple “library” files can be separately compiled and linked to a main program, allowing for relatively easy reuse.
  4. Should Fortran 77 code be composed out of the simplest parts? It’s entirely possible to write Fortran 77 programs that are made out of functions and subprocedures that are very simple, if you choose to. Whether the language scales to very large composed programs of tens of thousands of lines of code is an interesting challenge. There are some limitations on naming in particular: Fortran 77 names aren’t case-sensitive, and cannot be qualified with namespaces or module names, so more thought is needed to prevent us running out of unique and meaningful names as we add more and more parts. (But in 1979, this was less of a problem because of hardware limitations.)

By and large, Simple Design is perfectly feasible in Fortran 77.

What about Tell, Don’t Ask? Let’s consider a familiar example of a function that knows too much.

The dialect of Fortran 77 implemented for the GNU77 compiler doesn’t support data structures (other dialects, like the one I used at university running on Sun hardware, did). So we have to pass in all of the data about the room and the carpet as individual parameters to calculate a quote. We are somewhat limited in what we can do as far as data encapsulation is concerned as a result.

If this was Fortran 90 or later (or Sun F77), we could have user-defined types for the room’s dimensions and the carpet’s pricing data, and we could encapsulate their creation in the same modules that access that data. In F90, we also have some control over visibility of module features. (See how similar language features enabled data encapsulation in C in a previous post.) So, if this was Fortran 90, things would be much easier.

In F77, we have a teeny bit of wiggle room to represent an “object” (e.g., a room) as a single entity that quote() doesn’t need to know the internal details of: we could represent the room’s dimensions as an array.

And our test client just passes in the room array.

It’s not so easy for the carpet pricing data. Fortran 77 arrays can only be of a single data type, so we can’t easily represent a real and boolean value in the same array without adding considerable complexity (e.g., a function for translating 0.0 and 1.0 into FALSE and TRUE). Is it worth it?

But it would be worth extracting a function in its own module for calculating the price of a carpet for an area of room, so that each module has a Single Responsibility.

So quote() can be simplified to:

Now, could we make these dependencies swappable? Well, surprisingly for this language, we can. Fortran 77 allows function references to be passed as parameters. So we could, for example, swap our area_of_room() function with a different implementation that has the same signature.

What if we also want to calculate the area of carpet required to fill a circular room?

If we add a function parameter to quote()…

…we can substitute this implementation by injection from the test program.
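The Fortran listings aren't shown here, so this is the same idea sketched in Python rather than F77: the room is a plain array of numbers, and the area calculation is passed into `quote()` as a function reference. The function names mirror those in the text where it gives them; `area_of_circular_room`, `carpet_price` and the parameter order are assumptions.

```python
import math


def area_of_room(room):
    # room is an array [width, length]
    return room[0] * room[1]


def area_of_circular_room(room):
    # room is an array [radius]
    return math.pi * room[0] ** 2


def carpet_price(area, price_per_sq_m, round_up):
    if round_up:
        area = math.ceil(area)
    return area * price_per_sq_m


def quote(area_fn, room, price_per_sq_m, round_up):
    # quote() neither knows nor cares which area function it was given,
    # nor what the numbers inside `room` mean.
    return carpet_price(area_fn(room), price_per_sq_m, round_up)


print(quote(area_of_room, [2.5, 2.5], 10.0, True))
print(quote(area_of_circular_room, [1.0], 10.0, True))
```

Both area functions take a single array parameter, which is what keeps their signatures interchangeable.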

This is only possible if the data required by each implementation either doesn’t change – so the parameters stay the same – or can be encapsulated (e.g., in an array) as a single parameter. Fortran 77 has very little support for data encapsulation, and this limits the scope for swappability. So I give it 50% for swappability.

Finally, since Fortran 77 has no explicit concept of modules, interfaces and visibility, we can’t control what features are exposed to a client. We can control what features that client uses, of course. But in my G77 set-up, if I make any change to any module, all the modules that depend on it have to be recompiled, even if they don’t use the feature I changed.

So, to sum up: Fortran 77 ticks quite a few of my boxes and has some limited SOLID credentials, but it’s not quite there. Fortran 90 fixed some of these problems, with derived types, explicit modules and explicit interfaces, making it more like C in those respects.

I give Fortran 77 6.5/10 for ease of applying these design principles.

 

You can find the complete source code at https://github.com/jasongorman/fortran-77-SOLID

 

 

 

S.O.L.I.D. in 1969?

For a talk I’m preparing on “Timeless Design Principles”, I thought it would be fun to journey back through the decades and demonstrate how the design principles I teach on the Codemanship courses could have been applied with the technology available at the time.

I’ve already shown how it’s possible in C, so that takes us back to 1989 – 30 years ago. But what if we travel back another 20 years to 1969: could we have written SOLID code then?

The short answer is “Yes, but…” And I’ll get to the “but…” at the end. For now, suffice to say that programming languages did exist in the late 1960s that enable SOLID code. The first object oriented language is Simula 67.

Here’s our now-familiar carpet quote example in Simula 67.

This CarpetQuote class takes two constructor parameters for a Room and a Carpet. They are simple data classes (“records”).

What we’ve got here is a classic case of a method that does more than one thing, as well as Feature Envy for the fields of Room and Carpet. Let’s fix that by extracting methods for calculating room area and carpet price, and moving them to where the data is.

So now CarpetQuote knows nothing about the details of how these calculations are done, and is greatly simplified into a composed method.

So that’s the S in SOLID, and Tell, Don’t Ask, ticked off our list of modular design principles. What about swappability (the O, L and D in SOLID)?

The designers of Simula 67 provided a simple mechanism for this: we can declare methods as virtual in a base class.

And we can implement them in subclasses.

CarpetQuote binds to the abstract types Room and Carpet still, so if we wanted to – for example – swap in a circular room, it’s a doddle.

Our test client can now pass in whichever shape of room it chooses, with no need to change CarpetQuote.
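For readers without a Simula compiler to hand, the nearest Python equivalent of virtual methods declared in a base class is an abstract base class. This is an analogy rather than a translation of the Simula listing, and the subclass names and carpet fields are assumptions.

```python
import math
from abc import ABC, abstractmethod


class Room(ABC):
    @abstractmethod
    def area(self):
        """Declared here, implemented by subclasses."""


class RectangularRoom(Room):
    def __init__(self, width, length):
        self._width = width
        self._length = length

    def area(self):
        return self._width * self._length


class CircularRoom(Room):
    def __init__(self, radius):
        self._radius = radius

    def area(self):
        return math.pi * self._radius ** 2


class Carpet(ABC):
    @abstractmethod
    def price(self, area):
        """Declared here, implemented by subclasses."""


class RoundedUpCarpet(Carpet):
    def __init__(self, price_per_sq_m):
        self._price_per_sq_m = price_per_sq_m

    def price(self, area):
        return math.ceil(area) * self._price_per_sq_m


class CarpetQuote:
    def __init__(self, room, carpet):
        # Binds only to the abstract Room and Carpet types.
        self._room = room
        self._carpet = carpet

    def quote(self):
        return self._carpet.price(self._room.area())


print(CarpetQuote(CircularRoom(1.0), RoundedUpCarpet(10.0)).quote())
```

Swapping the room shape is a change in the client that builds the objects, not in `CarpetQuote`.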

So we’ve ticked the Tell, Don’t Ask box, and the S, O, L and D boxes. What about Interface Segregation? Well, Simula 67 offers very limited support to achieve client-specific interfaces. We can hide methods a client doesn’t need to see using base classes that only declare those methods, but it’s a one-interface-per-class deal as Simula 67 doesn’t support multiple inheritance.

Having said that, if our classes only do one job, then I suspect the need for multiple interfaces to support multiple clients would be quite limited.

On the whole, it’s good news for our design principles in Simula 67. Now for the bad news…

The observant among you may have noticed that all of these code gists have the same file name. As far as I’ve been able to learn, with very limited – and often conflicting – documentation for a language I don’t think more than a handful of people have used in decades, the GNU Cim Simula compiler only accepts a single .sim source file. So true modularity isn’t possible with the tools available today.

I suspect in 1967 that – with very limited computing power – Simula programs didn’t get so big that they necessarily needed to be split across multiple source files. There is some limited support for a kind of modularity (classes inside classes or “packages”), but that seems to be in the same single file only.

With more time and a bit of work, someone could probably knock up a simple inline #INCLUDE pre-processor that could pull in code from other files, but that’s beyond the scope of my mini-adventure.

This has been fun and educational, though. I’d read about Simula but never tried it. Throwing together this little program was hard work, but seeing a 52-year-old language resurrected on my Windows laptop was rewarding – like hearing the engine of a classic car revving into life after it’s been rusting in someone’s garage since the 1980s.

If you’d like to have a go at some Simula yourself, here are a few resources to get you started:

  • GNU Cim Simula compiler (on modern Windows, follow the instructions for Win NT) – generates C code and then compiles with GCC
  • Introduction to Simula 67 – warning: the code in this guide almost certainly was not actually run on a computer
  • Simula 67 grammar – you’re going to need this, because almost all the guides I found online are not correct Simula 67

And you can find the complete source listing here.

Without syntax-directed support in my editor, and with the often not-very-helpful guides and compiler error messages, this little exercise involved a certain amount of trial and error to figure out exactly what the syntax is. In that sense, it was quite a nostalgic trip down memory lane. This is how things were 30+ years ago: learning to code from printed listings and misleading documentation by trial and error. It’s also the reason why I never publish code that I haven’t seen compile and run correctly. Academics: I’m looking at you!

 

 

 

S.O.L.I.D. in Kotlin

My journey of demonstrating how S.O.L.I.D. design principles can be applied in a range of programming languages going back 50+ years gets bang up-to-date with an example in Kotlin.

Now, you could probably argue that Kotlin is a no-brainer where this is concerned. Anything I can do in Java I can do in Kotlin, if I choose to. Kotlin has classes, interfaces and constructors. We can make data private just as easily as in Java. But still, I hear objections from developers doing pure FP in Kotlin that either:

a. “OO” design principles don’t apply (which is why I’ve stopped calling them that – they’re modular design principles), or…

b. We don’t need to apply S.O.L.I.D. to functional programming, because FP is innately S.O.L.I.D. (Spoiler Alert: it isn’t.)

Whereas with older languages like C and Fortran 77, I’m working harder to get around some language limitations, with languages like Kotlin and Clojure, I’m having to work harder to get around cultural limitations. To be fair, this is not a new phenomenon. I can clearly recall programmers telling me – in the heyday of OOP in the mid-to-late 90s – that you didn’t need to think about things like modularity because OOP is innately modular. (Spoiler Alert: it wasn’t.) Give a C programmer C++ and they’ll write you procedural C++ code. And, as my previous post illustrated, give a C++ programmer C and they’ll find a way to create objects with it.

I define code that’s effectively modular by three key properties:

  • It’s made of discrete parts that do one job each
  • Those parts know as little as possible about each other
  • Those parts are easily swappable

There’s no programming language on Earth that forces us to write code that ticks all three boxes. You have to tick the boxes yourself by the design choices you make.

Granted, there are things we need from a programming language to enable effective modularity:

  • The ability to break code up into discrete reusable units (i.e., modules)
  • The ability to control what client code can see of a module (or – in the case of Ruby, Python, JS etc – the ability to make that not matter with dynamic binding)
  • The ability to dynamically substitute a different implementation without re-writing the client code

These days, the vast majority of programming languages available to us score 3/3. There are some older languages that offer no mechanism for polymorphism, but you’re very probably not using one of them on a regular basis. You can even do it to a limited extent in Fortran 77. Any language that allows us to pass a function or procedure pointer/reference as a parameter is technically polymorphic.

Anyhoo, here we are in 2019, and the shiny new kid on the code block is JetBrains’ Kotlin. It’s a derivative of Java, with spiffy FP sensibilities. To an old Java hand like me, it takes no time at all to learn. Here’s the carpet quote example in Kotlin.

Again, the first problem that leaps out at me is that this function is doing more than one thing. It breaks the Single Responsibility principle. Let’s refactor each reason to change into its own function in its own module.

The next thing that’s bothering me is that our data classes Room and Carpet are unencapsulated. That always bugs me. In OO design, we say that data classes are a code smell. They hurt us in FP, too. A dependency’s a dependency. Let’s refactor our area() and price() functions into closures that hide the data from quote().

And yes, I would just as readily use a class instead of a closure. I’m not an FP purist.

This refactoring has killed two birds with one stone: we’re hiding the data from quote() and now we can easily swap in a different implementation for calculating room area and carpet price without changing quote().

For example, what about a circular room?
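The closure approach carries over to most languages with first-class functions. Here it is sketched in Python rather than Kotlin; the factory function names and the carpet's round-up flag are assumptions for illustration.

```python
import math


def rectangular_room(width, length):
    # The dimensions are captured by the closure and hidden from callers.
    def area():
        return width * length
    return area


def circular_room(radius):
    # A different shape with the same zero-argument signature.
    def area():
        return math.pi * radius ** 2
    return area


def carpet(price_per_sq_m, round_up):
    # The pricing data is captured in the same way.
    def price(area):
        billable = math.ceil(area) if round_up else area
        return billable * price_per_sq_m
    return price


def quote(area, price):
    # quote() receives two functions and sees none of their data.
    return price(area())


print(quote(rectangular_room(2.5, 2.5), carpet(10.0, True)))
print(quote(circular_room(1.0), carpet(10.0, True)))
```

Adding the circular room required a new function and no change to `quote()`.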

That ticks the O, the L and the D in S.O.L.I.D. – our dependencies are now easily swappable. So, so far, we’ve covered S.O.L.D. as well as Tell, Don’t Ask.

What about Interface Segregation? Well, unlike many languages that support FP, Kotlin also has direct support for classes that implement multiple client-specific interfaces. If we can do it in Java, we can do it in Kotlin.

Tick.

 

You can view the source files at https://github.com/jasongorman/kotlin_solid