
URL: https://thenewstack.io/pythons-developer-in-residence-probes-pull-request-patterns/




Python’s Developer-in-Residence Probes Pull Request Patterns

The Python Software Foundation's new hire is crunching the data to discover who's maintaining the open source language and how to support them.
Nov 23rd, 2021 6:00am by David Cassel

As a long-time contributor to the Python programming language, Łukasz Langa recently served as the release manager for Python versions 3.8 and 3.9. But the Poland-based developer is also the first person to hold the newly created position of developer-in-residence for CPython (the reference implementation of the Python programming language).

Established by the nonprofit Python Software Foundation, the developer-in-residence position will “assist CPython volunteer maintainers and the Steering Council,” according to a Python Foundation blog post in April.

Among the position’s duties are “analytical research to understand the project’s volunteer hours and funding.”

This is how the world ended up with a fascinating blog post offering detailed, data-driven insights into where Python comes from, with statistics on everything from the distribution of pull requests to how Python’s core developers spend their time.

So you know that #Python‘s codebase is over a million lines of code, right? 🤯

But did you know where the work is going these days, who does what, and what you can expect when you open your own PR? 👨🏻‍🔬

Neither did I… so I went and did some research: https://t.co/8fTgXrXJtJ

— Łukasz Langa (@llanga) October 18, 2021

In one way it’s the story of a programming language — how, through thousands of small pull requests, it continues to evolve.

But Langa’s own involvement is also unique, ultimately proving that the story behind the story is just as interesting.

A Sponsored Residency

Langa sees his residency in the larger sweep of history, remembering the days when Python was developed only by volunteers — and by Guido van Rossum, who held a paid position at Dropbox where it was understood he’d also use his time to work on the programming language.

“And many of us were in the same situation,” Langa reminisced in a late-August appearance on the “Talk Python” podcast. “I was tolerated as a CPython core developer at Facebook, and some others were at their own respective companies.

“I was super frustrated by this, because of this tremendous value that we’re giving to the entire community, including multibillion-dollar corporations.”

But now he’s fulfilling Python’s first residency in a new format — sponsored by Google, but run through the Python Software Foundation.

“That is amazing. I am very happy that they did, because I really believe that this is something that might alter how we think about maintenance of community-driven projects like Python,” Langa said, adding, “This is a kind of a game-changer. We have not done this way of sponsoring a project before, where we’re actually thinking about the ‘software’ word in ‘Python Software Foundation’ — where we directly sponsor work on the source code.”

Langa takes his responsibility seriously, as his remarks on the podcast make clear: “I do believe my particular performance kind of will make or break future ideas on whether this should be extended to more people, right?”

He laughed, then added playfully, “Or just closed down altogether! So it’s not only providing value to the project, it’s literally providing proof that this development model works. So yeah, there’s certain responsibility around it.”

Patterns in Pull Requests

Langa’s blog post emphasized the importance of transparency and visibility in the developer-in-residence position — which for him includes blogging regularly about the experience, keeping the rest of the community involved in his journey.

One of the tasks given to him, he wrote, was to search for patterns in the pull requests for libraries (as well as identifying their top contributors). So Langa began by exploring years of historical data from the python/cpython Git repository and its pull requests — converting that data into Python objects (with scripts he’s shared on GitHub) and then transforming it into a SQLite file, exploring it all with the data exploration/visualization tool Datasette.
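The pipeline described above (pull request records turned into Python objects, then stored in SQLite for exploration) can be sketched roughly as follows. This is a minimal illustration and not Langa's actual scripts: the `PullRequest` class, its fields, the table name, and the sample rows are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass, astuple
from typing import Optional


@dataclass
class PullRequest:
    # Hypothetical record shape; the real scripts may track other fields.
    number: int
    author: str
    created_at: str            # ISO 8601 date string
    merged_at: Optional[str]   # None if the pull request was never merged


def load_pull_requests(prs):
    """Store pull request objects in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pull_requests ("
        "number INTEGER PRIMARY KEY, author TEXT, "
        "created_at TEXT, merged_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO pull_requests VALUES (?, ?, ?, ?)",
        [astuple(pr) for pr in prs],
    )
    conn.commit()
    return conn


# Invented sample data, for illustration only.
SAMPLE_PRS = [
    PullRequest(1, "alice", "2021-01-01", "2021-01-03"),
    PullRequest(2, "bob", "2021-01-02", None),
    PullRequest(3, "alice", "2021-01-05", "2021-01-06"),
]

conn = load_pull_requests(SAMPLE_PRS)
merged_count = conn.execute(
    "SELECT COUNT(*) FROM pull_requests WHERE merged_at IS NOT NULL"
).fetchone()[0]
print(merged_count)
```

Writing to a file path instead of `":memory:"` would produce a SQLite file that a tool such as Datasette could then open.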

The first interesting find? Which dates had the most merges.

“Right away you see that September 2019 was the most active recorded week in our database in terms of merges,” Langa wrote on his blog, adding, “That’s no surprise, it was the week of our annual core sprint, that year happening at Bloomberg in London.”

Langa quickly generated a bar graph, where the bars for the sprint days tower over the other bars, between two and three times taller. He called it “tangible evidence those events are worth it.”
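Finding the busiest week comes down to grouping merge dates by week and counting. The sketch below shows one way to do that with ISO week numbers; the function name and the sample dates are invented for illustration, and Langa's own query may have bucketed the data differently.

```python
from collections import Counter
from datetime import date


def merges_per_week(merge_dates):
    """Count merges per (ISO year, ISO week) bucket."""
    counts = Counter()
    for merged_on in merge_dates:
        iso = merged_on.isocalendar()
        counts[(iso[0], iso[1])] += 1
    return counts


# Invented merge dates, for illustration only.
SAMPLE_MERGES = [
    date(2019, 9, 2),
    date(2019, 9, 9),
    date(2019, 9, 10),
    date(2019, 9, 11),
    date(2019, 9, 17),
]

counts = merges_per_week(SAMPLE_MERGES)
busiest_week, busiest_count = counts.most_common(1)[0]
print(busiest_week, busiest_count)
```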

But soon he’d arrived at the statistics that can finally answer the question: where do Python’s core developers spend their time?

CPython consists of over 629,000 lines of Python code and more than 550,000 lines of C code. But by querying the data, Langa was able to identify the one single file that contains both the most changes since the beginning of 2019 (with 259 merged pull requests) and the most lines of code that have been changed (12,972). It’s Python/ceval.c — the 7,080-line file which actually executes the compiled code.
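A ranking like this one is a straightforward aggregate query once each changed file is recorded per pull request. The following is a sketch under assumed names: the `file_changes` table, its columns, and the sample rows are hypothetical and do not reflect the schema or numbers in Langa's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_changes ("
    "pr_number INTEGER, path TEXT, lines_changed INTEGER)"
)
# Invented rows: one row per (pull request, file touched).
conn.executemany(
    "INSERT INTO file_changes VALUES (?, ?, ?)",
    [
        (1, "Python/ceval.c", 120),
        (2, "Python/ceval.c", 40),
        (2, "Python/pylifecycle.c", 15),
        (3, "Lib/email/message.py", 8),
        (4, "Python/ceval.c", 5),
        (4, "Python/pylifecycle.c", 30),
    ],
)

# Rank files by number of distinct pull requests, then by lines changed.
ranking = conn.execute(
    """
    SELECT path,
           COUNT(DISTINCT pr_number) AS prs,
           SUM(lines_changed) AS lines
    FROM file_changes
    GROUP BY path
    ORDER BY prs DESC, lines DESC
    """
).fetchall()

for path, prs, lines in ranking:
    print(path, prs, lines)
```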

And to Langa’s surprise, No. 2 on the list of most merged pull requests is Python/pylifecycle.c, which holds the interpreter’s top-level routines (including init and exit), with 222 merged pull requests.

“Who would think the most change happens the deepest inside the interpreter?” he asked.

Langa was also able to tease out information on who’s making pull requests. The No. 1 most frequent contributor is a GitHub user named miss-islington, a bot that automatically checks merged pull requests for any issues with backporting. (In keeping with Python’s roots, the bot was named after the character in “Monty Python and the Holy Grail” who must insist to a mob that she is not a witch.)

“Clearly, it pays to be a bot (like miss-islington, web-flow, or blurb-it),” writes Langa, “or a release manager since this naturally causes you to make a lot of commits.”

The top two human contributors are Victor Stinner (paid by Red Hat to maintain Python upstream) and Serhiy Storchaka (a Ukraine-based Python core developer), both of whom Langa acknowledges for “amazing amounts of activity.” Stinner comes in at No. 2 with 3,775 merged pull requests, while Storchaka has 2,582.

And congrats @VictorStinner for top non-bot spot! :)

— Henry Schreiner III (@HenrySchreiner3) October 19, 2021

So far, nothing proved that I am not a bot.

— Victor Stinner 🐍 (@VictorStinner) October 19, 2021

Langa then tried writing a script identifying the top five contributors for each file, though after breaking it down into 636 categories, he still discovered that for 618 of those categories, two of the top five contributors were … Stinner and Storchaka, again. “In fact, some files are missing contributors entirely save for our two top giants,” Langa wrote.
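The idea of a per-file top-contributors list can be illustrated with a small grouping routine. This is only a sketch of the general technique, not Langa's script: the function name, the input shape, and the contributor names are all made up.

```python
from collections import Counter, defaultdict


def top_contributors(changes, n=5):
    """Return the n most frequent authors for each file path.

    `changes` is an iterable of (path, author) pairs, one pair per
    merged change touching that path.
    """
    per_file = defaultdict(Counter)
    for path, author in changes:
        per_file[path][author] += 1
    return {
        path: [author for author, _ in counter.most_common(n)]
        for path, counter in per_file.items()
    }


# Invented sample data, for illustration only.
SAMPLE_CHANGES = [
    ("Lib/a.py", "alice"),
    ("Lib/a.py", "alice"),
    ("Lib/a.py", "bob"),
    ("Lib/b.py", "bob"),
]

top = top_contributors(SAMPLE_CHANGES, n=1)
print(top)
```

A contributor who appears in nearly every file's list, as Stinner and Storchaka did in the real data, would show up under almost every key of the returned dictionary.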

But as reassuring as it was to find them watching over the project, Langa was also able to identify some “experts” who were “laser-focusing” on specific parts of Python (like its handling of email or code for handling types). This was information specifically requested by the Python Software Foundation, so it may prove useful as Python development continues in the future.

Drive-Bys and ‘Transformational Potential’

In a July blog post, Langa even argued that his role had “transformational potential” for Python. “In short, I believe the mission of the developer-in-residence is to accelerate the developer experience of everybody else. This includes not only the core development team, but most importantly the drive-by contributors submitting pull requests and creating issues on the tracker.”

Langa elaborated on the importance of casual contributors on the “Talk Python” podcast. “There’s a lot of us on the core team, and even more people around the core team who are kind of — well, we call them drive-by contributors,” he said.

“They would find an issue, produce a bunch of pull requests, and maybe then kind of disappear … Obviously every year this changes, how Python is developed. We’re going to have a bunch of people who are super invested, and they’re going to be spending crazy amounts of time, including on weekends and whatnot, to work on Python, even for free. I know — I did that for a decade.

“So those contributions are super valued. But usually, those really don’t — well, those people change, right? Like, you can’t really do this in a consistent manner, day-in, day-out, for a long period of time. Your life situation changes, your job changes or whatnot, and, you know, you stop contributing. And what happens to Python then? Well, we lose some value.”

Average Wait Times for Pull Request Merges

Langa thinks one way of encouraging more contributions is to merge more of the pull requests that have been submitted. “Currently we have over 1,400 open pull requests. And I’ve been on a mission to kind of bring that number down,” he wrote. “Currently as I’m looking at it, it’s 1,421.”

So there was another crucial question Langa explored in his Python-related data science: how long does it take to merge a pull request?

His first results showed the average wait time is 14.6 days, Langa wrote. (Although, “obviously, the answer in a big project is ‘it depends.’ Averages lie.”)

More research revealed that, especially for non-core developers, the deviations from the average can be wildly large. Langa’s calculations place their standard deviation at 81.7 days. And while the average wait time for core developers is nearly 9.5 days, their standard deviation is also high: just under 42 days, jumping to 77.4 days if the core developers aren’t merging their own pull requests.
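A small worked example shows why a standard deviation several times larger than the mean signals that the average is a poor summary. The wait times below are invented, and the use of the population standard deviation is an assumption; the article does not say which variant Langa computed.

```python
from statistics import mean, median, pstdev

# Invented wait times in days: four quick merges and one long-stalled one.
wait_days = [1, 2, 2, 3, 92]

average = mean(wait_days)    # pulled upward by the single outlier
typical = median(wait_days)  # what most contributors actually experienced
spread = pstdev(wait_days)   # population standard deviation (assumed variant)

print(average, typical, round(spread, 1))
```

Here a single stalled pull request drags the mean to 20 days even though most of the sample merged within three, which is the kind of distortion behind the remark that averages lie.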

Meanwhile, pull requests that don’t get merged — but are closed instead — wait, on average, more than 105 days, Langa wrote. “But as I said, averages lie.”

The next steps are still to be determined. But, for now, there are fresh data-driven perspectives on the current state of Python development.

It’s all part of how Python’s first developer-in-residence is keeping the community involved in his journey — and he’s now even inviting them to help choose what he should investigate next.

“If you have any suggestions on things I could look at,” Langa’s blog post concluded, “let me know!”

David Cassel is a proud resident of the San Francisco Bay Area, where he's been covering technology news for more than two decades. Over the years his articles have appeared everywhere from CNN, MSNBC, and the Wall Street Journal Interactive...
Red Hat is a sponsor of The New Stack.