URL: https://thenewstack.io/this-week-in-programming-github-copilot-copyright-infringement-and-open-source-licensing/

This Week in Programming: GitHub Copilot, Copyright Infringement and Open Source Licensing - The New Stack


This Week in Programming: GitHub Copilot, Copyright Infringement and Open Source Licensing

GitHub's new feature is "your AI pair programmer" but might also be appropriately called "IntelliSense on steroids."
Jul 3rd, 2021 6:00am by Mike Melanson
Feature photo by Westwind Air Service on Unsplash.

Earlier this week, GitHub introduced GitHub Copilot, a new feature that it is referring to as “your AI pair programmer” but might also be appropriately called “IntelliSense on steroids.” Built using OpenAI Codex, a new system that the company says is “significantly more capable than GPT-3 in code generation,” the tool not only autocompletes lines of code but will offer entire blocks of code in response to both code that you type and natural language.

Because Copilot was “trained on billions of lines of public code,” one of the first questions to come up concerns copyright, and specifically the viral GPL license, which requires that all derivative works carry that same license.

copyright does not only cover copying and pasting; it covers derivative works. github copilot was trained on open source code and the sum total of everything it knows was drawn from that code. there is no possible interpretation of “derivative” that does not include this

— eevee (@eevee) June 30, 2021

Now, while there is plenty of conversation floating around on Twitter and a few Hacker News threads, most of it, as you might expect, falls under the “I am not a lawyer” disclaimer. There is one Hacker News comment, from GitHub CEO Nat Friedman, however, that offers a bit of a response to questions along these same lines.

“In general,” writes Friedman, “(1) training ML systems on public data is fair use (2) the output belongs to the operator, just like with a compiler.” He then offers a link to OpenAI’s position on training machine learning models, which argues that “training AI systems constitutes fair use” and furthermore that “policy considerations underlying fair use doctrine support the finding that training AI systems constitute fair use.”

Well, of course, we thought you might say something like that, Nat.

But Friedman is not alone — a couple of actual lawyers and experts in intellectual property law took up the issue and, at least in their preliminary analysis, tended to agree with Friedman. First, Neil Brown examines the idea from an English law perspective and, while he’s not so sure about the idea of “fair use” if the idea is taken outside of the U.S., he points simply to GitHub’s terms of service as evidence enough that the company can likely do what it’s doing. Brown points to passage D4, which grants GitHub “the right to store, archive, parse, and display Your Content, and make incidental copies, as necessary to provide the Service, including improving the Service over time.”

“The license is broadly worded, and I’m confident that there is scope for argument, but if it turns out that Github does not require a license for its activities then, in respect of the code hosted on Github, I suspect it could make a reasonable case that the mandatory license grant in its terms covers this as against the uploader,” writes Brown. Overall, though, Brown says that he has “more questions than answers.”

I’ve seen the source code for this. I remember something along the lines of pic.twitter.com/vVRSlUSU2e

— Tomáš Rottenberg (@hacksparr0w) June 29, 2021

In a more definitive take, Andres Guadamuz, a senior lecturer in intellectual property law at the University of Sussex and the Editor in Chief of the Journal of World Intellectual Property, takes up the question of whether or not GitHub Copilot is infringing copyright, concluding that “this is neither copyright infringement nor license breach, but I’m happy to be convinced of the contrary.”

On the idea of copyright infringement, Guadamuz first points to a research paper by Albert Ziegler, published by GitHub, which looks at situations where Copilot reproduces text verbatim and finds those instances to be exceedingly rare. In the original paper, Ziegler notes that “when a suggestion contains snippets copied from the training set, the UI should simply tell you where it’s quoted from,” as a safeguard against infringement claims.

On the idea of the GPL license and “derivative” works, Guadamuz again disagrees, arguing that the issue at hand comes down to how the GPL defines modified works, and that “derivation, modification, or adaptation (depending on your jurisdiction) has a specific meaning within the law and the license.”

“You only need to comply with the license if you modify the work, and this is done only if your code is based on the original to the extent that it would require a copyright permission, otherwise it would not require a license,” writes Guadamuz. “As I have explained, I find it extremely unlikely that similar code copied in this manner would meet the threshold of copyright infringement, there is not enough code copied, and even if there is, it appears to be mostly very basic code that is common to other projects.”

While Copilot definitely appears to spit out verbatim code once in a while, it is the infrequency of that occurrence that seems to assure Guadamuz that the tool is in little danger of being successfully litigated against. In one comment on his article, he writes that “this is all going to be solved eventually by Codex and Copilot offering a similarity tool where programmers can check whether there is any recitation in their code,” which might help with scenarios such as this:

I don’t want to say anything but that’s not the right license Mr Copilot. pic.twitter.com/hs8JRVQ7xJ

— Armin Ronacher (@mitsuhiko) July 2, 2021
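
To make the idea of a recitation check concrete, here is a minimal sketch of one way a similarity tool could flag verbatim overlap between a suggestion and a known source file. This is not how Codex or Copilot works; the function names (`tokenize`, `ngrams`, `recitation_score`) and the six-token window are assumptions chosen purely for illustration.

```python
import re


def tokenize(code: str) -> list[str]:
    # Split code into identifiers, numbers, and single punctuation characters,
    # so that whitespace and line-break differences are ignored.
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code)


def ngrams(tokens: list[str], n: int) -> set[tuple[str, ...]]:
    # Every run of n consecutive tokens, collected as a set.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def recitation_score(suggestion: str, known_source: str, n: int = 6) -> float:
    # Fraction of the suggestion's n-token runs that also appear in the
    # known source. 0.0 means no shared runs; 1.0 means every run is shared.
    # The window size n is a hypothetical default, not a tuned value.
    suggestion_runs = ngrams(tokenize(suggestion), n)
    if not suggestion_runs:
        return 0.0  # suggestion is shorter than n tokens
    source_runs = ngrams(tokenize(known_source), n)
    return len(suggestion_runs & source_runs) / len(suggestion_runs)
```

A high score only says that token sequences match; whether a match matters legally is exactly the question the lawyers above are debating, and very common boilerplate would score high against many files.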

And while we’re here, if copyright infringement and open source licensing are less of a concern for you, and you’re more interested in just how cool and useful a tool like GitHub Copilot might be, make sure to head on over and read Darryl Taft’s analysis of Copilot, which he calls “A Powerful, Controversial Autocomplete for Developers”.

This Week in Programming

  • “Uncomfortable Questions” on Windows’ Android Apps: Last week, we wrote about how Microsoft was yet again causing the netherworld to plunge below freezing with its addition of Android apps to Windows. Something about Microsoft’s use of the Amazon AppStore for Android, and not Google’s Play Store, felt a bit… suspicious… and Android developer and author Mark Murphy points out why in his blog post on “Windows 11, Amazon, and Uncomfortable Questions”. In his post, Murphy writes that “Amazon pioneered the ‘replace the developer signature’ approach that Google uses with App Signing. And, Amazon does so specifically to be able to modify every Android app that they distribute,” pointing out that this might go beyond collecting analytics, which is troublesome enough, and extend to modifying apps to bypass end-to-end encryption, for example. Uncomfortable questions, indeed.

hope this helps someone re: infinidash #i8h pic.twitter.com/fmzLt6WH5b

— ln körbes (@ellenkorbes) July 2, 2021

  • Docker Desktop 3.5: The latest version of Docker Desktop, Docker Desktop 3.5, has arrived with improved volume management, Docker Dev Environments and more. In addition to the Dev Environments, which we reviewed last week, Docker Desktop 3.5 gives users the ability to better manage files in their volumes, and continues the roll-out of Docker Compose V2 Beta, which brings the compose command to the Docker CLI. On top of all that, Docker Desktop 3.5 adds a warning for images that are incompatible with Apple Silicon machines and takes a hint from user feedback, making requests for feedback just a little “less disruptive”. To check it all out, download or update to Docker Desktop 3.5.
  • Quarkus 2.0 Drops: Red Hat has launched Quarkus 2.0, highlighting continuous testing, DevServices, a new Developer UI and a Developer CLI among the new features, as well as improved performance with “lightning-fast” RESTEasy Reactive. Quarkus is Red Hat’s Kubernetes Native Java framework tailored for OpenJDK HotSpot and GraalVM, or what it refers to as “Supersonic Subatomic Java”, and this move to 2.0 “signals a new level of maturity for the project,” they write. To find out all that’s new, head on over to the Quarkus 2.0 launch site or to code.quarkus.io to give it a try.

These Github Copilot tweets just write themselves

— willman (@willmanduffy) June 29, 2021

Mike is a freelance writer, editor, and all-around techie wordsmith. Mike has written for publications such as ReadWriteWeb, Venturebeat, and ProgrammableWeb. His first computer was a "portable" suitcase Compaq and he remembers 1200 baud quite clearly.
TNS owner Insight Partners is an investor in: Docker, OpenAI.