We need to talk about GitHub Copilot. This is the ML-driven programming tool – powered by OpenAI Codex – that Microsoft is promoting as “Your AI pair programmer”, and which they claim “works alongside you directly in your editor, suggesting whole lines or entire functions for you.”
Now, full disclaimer: I’ve not been able to try the Copilot Beta yet – there’s a waiting list – so my thoughts are based purely on what I’ve read about it, and what I’ve seen of it in demonstration videos by people who’ve tried.
At first glance, Copilot looks very impressive. You can, for example, just declare a descriptive function or method name, and it will suggest a matching implementation. Or you can write a comment about what you want the code to do, and it will generate it for you.
All the examples I’ve seen were for well-defined, self-contained problems – “calculate a square root”, “find the lowest number” and so on. I’ve yet to see it handle more complex problems like “send an SMS message to this number when a product is running low on stock”.
Copilot was trained on GitHub’s enormous wealth of other people’s code. This in itself is contentious, because when it autosuggests a solution, that might be your code that it’s reproducing without any license. Much has been made of the legality and the ethics of this in the tech press and on social media, so I don’t want to go into that here.
As someone who trains and coaches teams in code craft, though, I have other concerns about Copilot.
My chief concern is this: what Copilot does, to all intents and purposes, is copy and paste code off the Internet. As the developers of Copilot themselves admit:
“GitHub Copilot doesn’t actually test the code it suggests, so the code may not even compile or run.” (Source: https://copilot.github.com/)
I warn teams constantly that copying and pasting code verbatim off the Internet is like eating food you found in a dumpster. You don’t know what’s in it. You don’t know where it’s been. You don’t know if it’s safe.
When we buy food in a store, or a restaurant, there are rules and regulations. The food, its ingredients, its preparation, its storage, its transportation are all subject to stringent checks to make sure as best we can that it will be safe to eat. In countries where the rules are more relaxed, incidents of food poisoning – including deaths – are much more common.
Code is like food. When we reuse code, we need to know if it’s safe. The ingredients (the code it reuses), its preparation and its delivery all need to go through stringent checks to make sure that it works. This is why we have a specific package design principle called the Reuse-Release Equivalency Principle – the unit of code reuse is the unit of code release. In other words, we should only reuse code that’s been through a proper, disciplined and predictable release process that includes sufficient testing, with no further changes after that.
Maybe that Twinkie you fished out of the dumpster was safe when it left the store. But it’s been in a dumpster, and who knows where else, since then.
So my worry is that prolific use of a tool like Copilot will riddle production software – software that you and I consume – with potentially unsafe code.
My second concern is about understanding and – as a trainer and coach – about learning. I work with developers all the time who rely heavily on copying and pasting to solve problems in their code. Often, they’ll find an example of something in their own code base, and copy and paste it. Or they’ll find an example on the Web and copy and paste that. What I’ve noticed is that the developers who copy and paste a lot tend to pick things up slower – if ever.
I can buy a ready-made cake from Marks & Spencer, but that doesn’t make me a baker. I learn nothing about baking from that experience. No matter how many cakes I buy, I don’t get any better at baking.
Of course, when folk copy and paste code, they may change bits of it to suit their specific need. And that’s essentially what Copilot is doing – it’s not an exact copy of existing code. Well, you can also buy plain cake bases and decorate them yourself. But it still doesn’t make you a baker.
Some will argue “Oh, but Jason, you learned to program by copying code examples.” And they’d be right. But I copied them out of books and out of computing magazines. I had to read the code, and then type it in myself. The code had to go through my brain to get into the software.
Just like the code had to go through Copilot’s neural network to get into its repertoire. There’s perhaps an irony here that what Codex has done is automate the part where programmers learn.
So, my fear is that heavy use of Copilot could result in software that’s riddled with code that doesn’t necessarily work and that nobody on the team really understands. This is a restaurant where most of the food comes from dumpsters.
Putting aside other Copilot features I might take issue with (generating tests from implementation code? – shudder), I really feel that it’s a brilliant solution to completely the wrong problem. And I’m not the only one who thinks this.
If we were to observe developers and measure where their time goes, how much of it is spent looking for code examples? How much of it is spent typing code? That’s a pie chart I’d like to see. What we do know from decades of experience is that developers spend most of their time trying to understand code – often code they wrote themselves. (Hands up. Who else hates Monday mornings?)
Copilot’s main selling point is like trying to optimise a database application that does 10 reads for every 1 write by making the writes faster.
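To put rough numbers on that analogy, here is a minimal back-of-the-envelope sketch. It assumes, purely for illustration, that a read and a write cost the same before any optimisation; the function name overall_speedup is made up for this example.

```python
def overall_speedup(reads_per_write: float, write_speedup: float) -> float:
    """Estimate the end-to-end speedup when only the writes get faster.

    Assumption (for illustration only): before optimisation, one read
    and one write each take one unit of time.
    """
    before = reads_per_write + 1.0                  # e.g. 10 reads + 1 write
    after = reads_per_write + 1.0 / write_speedup   # reads unchanged, write shrunk
    return before / after


# Doubling write speed with 10 reads per write:
print(round(overall_speedup(10, 2), 3))     # 1.048
# Making writes a thousand times faster barely does better:
print(round(overall_speedup(10, 1000), 3))  # 1.1
```

Under those assumptions, even near-instant writes cap the overall gain at about 10%, because the reads still dominate.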
Having the code pasted into your project for you doesn’t reduce this overhead. It’s someone else’s code. You have to read it and you have to understand it (and then, ideally, you have to test it). It breaks the Reuse-Release Equivalency Principle. It’s not safe reuse.
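As a rough illustration of what "test it" might look like for one of those self-contained problems ("find the lowest number"), here is a minimal sketch. The function lowest is a hand-written stand-in for a pasted or suggested snippet, not actual Copilot output, and the checks are only the ones I'd assume a team would want at a minimum.

```python
def lowest(numbers):
    """Hypothetical stand-in for a snippet pasted into the project."""
    result = numbers[0]
    for n in numbers[1:]:
        if n < result:
            result = n
    return result


def check_lowest():
    # The obvious case the snippet was probably written for.
    assert lowest([3, 1, 2]) == 1
    # Edge cases nobody checked before it landed in the code base.
    assert lowest([5]) == 5
    assert lowest([-2, -7, 0]) == -7
    # An empty list: this snippet raises IndexError. Is that what we want?
    try:
        lowest([])
    except IndexError:
        pass
    else:
        raise AssertionError("expected IndexError for an empty list")


check_lowest()
```

Writing even these few checks forces you to read the code and decide what it should do with an empty list, which is exactly the understanding that pasting skips.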
And Copilot isn’t a safe pair programming partner, being as its only skill is fishing Twinkies out of the code dumpster of GitHub.
I think a lot of more experienced developers – especially those of us who’ve lived through both the promise of general A.I. (still 30 years away, no matter when you ask) and of Computer-Aided Software Engineering – have seen it all before in one form or another. We’re not going to lose any sleep over it.
The tagline for Copilot is “Don’t fly solo”, but anyone using it instead of programming with a real human is most definitely flying solo.
Wake me up when Copilot suggests removing the duplication it’s creating, instead of generating more of it.