
URL: https://thenewstack.io/open-source-builders-why-data-scientists-love-matplotlib/


Open Source Builders: Why Data Scientists Love Matplotlib

Matplotlib is an essential Python 2D plotting library for data visualization. We interview the project lead, Thomas Caswell.
May 26th, 2020 1:00pm by Matt Asay
AWS sponsored this post.
This is part of a series on Open Source Builders. For a list of other articles in this series, check out the introductory post.

Matt Asay
Matt is a principal at AWS and has been involved in open source and all that it enables (cloud, machine learning, data infrastructure, mobile, etc.) for nearly two decades, working for a variety of open source companies and writing regularly for InfoWorld and TechRepublic. You can follow him on Twitter (@mjasay).

No matter how many thousands of large data sets you may be crunching with TensorFlow, or how much you use PyTorch to accelerate tensor computation with GPUs, at some point you’ll want to represent your results with cross-platform charts and figures. And for that, you’re almost certainly going to want to get to know Matplotlib, an essential Python 2D plotting library for data visualization. Though Matplotlib is beloved by data scientists, its roots are in physical science, oceanography, and climatology. Data science folks came later, borrowed the core libraries, and have applied them to more corporate uses.

Though today there is an ever-growing universe of Python-based data science tools and libraries, for years Matplotlib was the only way to make plots in Python, and it remains the default. At the heart of the Matplotlib development community is project lead Thomas Caswell, who found his way to leadership almost by accident as he went from answering Matplotlib questions on Stack Overflow to submitting bug fixes to authoring patches.

In a recent web conference, Caswell walked me through his journey to Matplotlib leadership and why he contributes.

A Contribution Evolution

Caswell wasn’t the founder of Matplotlib — that honor goes to John Hunter, an epilepsy researcher at the University of Chicago Medical Center in the early 2000s. Hunter grew tired of fighting for access to the hardware key dongle that allowed him to use a proprietary software program for doing electrocorticography analysis. Hunter first tried to replace this program with MATLAB but found it unsuitable for his needs, so he set out to build what became Matplotlib.

During this time, Caswell had his own struggles with MATLAB related to memory management and was looking for options to further his academic work at the University of Chicago. As he dove into Python, he naturally ran into Matplotlib and, as mentioned, first contributed insight and eventually code. That code was made all the better under the tutelage of Mike Droettboom, who assumed project leadership after Hunter’s unfortunate passing in 2012. As Caswell remembers, “Droettboom taught me almost everything I know about programming.” Caswell worked closely with Droettboom and, over time, became Matplotlib’s lead maintainer.

How Caswell’s Matplotlib contributions evolved is worth noting, because this evolution is a useful guide for others who may want to start contributing to an open source project.

Caswell notes that answering Stack Overflow questions turns out to be an exceptional way to learn a library, because it puts you into a position to encounter others’ use cases. It was also an ideal way to start “fixing” bugs in the code without touching the code. Caswell says that eventually he was given commit rights so that he could apply pressure on the bug backlog in the other direction.

At the same time, Caswell’s experience surfaces another facet of community-driven open source projects: you can’t force it. Caswell says that over the past several years, Matplotlib’s development has been entirely volunteer-driven — by a combination of people from industry who do it either on their discretionary time at work or on nights and weekends, and a collection of professors and students. He says this makes for an interesting management problem, because you can’t tell anyone to do anything. There is “no coercion” in the community — just persuasion.

Add to this the interesting conflicts that arise when you have primarily text communication between people from different cultural backgrounds, he says, and managing an open source community ends up offering MBA-level experience to people who likely have zero interest in an MBA.

Familiar but Different

Over the years, one of the guiding principles of Matplotlib has been to retain some connection to MATLAB while also innovating. That tie back to MATLAB has been important, because so much of the potential user community has historically started with MATLAB while in science and engineering classes at universities.

Herein lies one of the great strengths of Matplotlib, as well as a fundamental tension: how to balance familiarity with innovation.

As Python has become the de facto language for data science, and is widely taught in universities, an ever-rising percentage of Matplotlib users have never used MATLAB. This frees the Matplotlib community from needing to hew to the MATLAB standard. Caswell says the benefit of MATLAB familiarity is starting to wane. He’s quick to add, however, “The Python world is not killing MATLAB. They’re also growing like crazy. We’re just growing faster.”

But what to build to stoke future growth?

“If you make a change that costs all of your users two hours, that’s a huge hit to global productivity,” Caswell says, prompting the project community to take care about introducing changes to the API. At the same time, he says you must evolve and add new features to keep up with evolving user requirements. “If you don’t keep up, you’re going to get replaced,” he says. This balance is a key tension that Caswell — and other project maintainers — must deal with on an ongoing basis.

Making Matplotlib Pay

Given this utility for so many others, I asked Caswell how much Matplotlib, which is used “everywhere” within Brookhaven National Laboratory (BNL), contributes to his work there. Caswell spends five to ten percent of his work time contributing upstream to Matplotlib. That percentage may go up, thanks to a $250,000 grant from the Chan Zuckerberg Initiative to help Matplotlib developers address the project’s maintenance backlog, among other things.

Caswell may not do as much of the actual coding for Matplotlib anymore, but as project lead it’s still a significant time commitment — without getting paid for most of that work. Why does he do it?

In his graduate school years, he figured out pretty quickly that he didn’t want to be a professor. “That did not look like fun at all,” he says. Instead he discovered that he loves building tools and really wanted to build better tools for scientists. “That’s the thing that keeps me going,” he concludes. “Thinking about the grad student alone in their lab two stories underground at 11 p.m. on a Saturday. Supporting that person is what keeps me going. That’s my passion.”

There are many ways to contribute to Matplotlib, and all are welcome. If you or your organization use Matplotlib, the community would love to feature your use case on the Matplotlib blog. Visit their How to Contribute page to learn more, and check out the Matplotlib Developers’ Guide to find out how to contribute documentation, bug reports/fixes, or other code.

Feature image via Pixabay.

Since its inception, Amazon Web Services (AWS) has been the best place for customers to build and run open source software in the cloud. AWS is proud to support open source projects, foundations, and partners.