Study Shows AI Coding Assistant Improves Developer Productivity
Sep 24, 2024 2 min read
Researchers from Microsoft, MIT, Princeton University, and the Wharton School of the University of Pennsylvania recently published a study showing that the use of GitHub Copilot increased developer productivity. The team conducted three separate randomized controlled trials (RCTs) involving over 4,000 developers; those using Copilot achieved a 26% increase in productivity.
The three experiments were performed at Microsoft, Accenture, and an "anonymous Fortune 100 electronics manufacturing company." For each of the 4,867 developers in the study, the researchers measured the weekly number of pull requests, commits, and code builds performed. They found that developers using Copilot had an average increase of 26.08% in the number of pull requests completed per week. They also found that productivity varied by developer experience, with less experienced developers getting more benefit from Copilot. According to the research team:
Our work complements both the literature on lab experiments as well as these observational studies by studying the impact of generative AI using a field experiment in an actual workplace setting. To date, there is still a dearth of experimental studies examining the effect of generative AI in a field setting.
The experiments were conducted in 2022 and 2023, using a version of Copilot based on GPT-3.5. At Microsoft and Accenture, developers in the experiment were randomly selected to use Copilot, while in the anonymous company, all devs were granted access eventually, but with randomly selected start dates. In addition to tracking the developer productivity measures, the researchers tracked Copilot adoption and usage.
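As a rough illustration of the design described above, the following Python sketch randomly splits developers into a Copilot group and a control group, then compares mean weekly pull requests between the two. It is a simple difference-in-means calculation under stated assumptions, not the estimator used in the paper; the function names `randomly_assign` and `relative_lift` are hypothetical and introduced only for this example.

```python
import random
from statistics import mean


def randomly_assign(developer_ids, seed=0):
    """Split developers into (treated, control) groups at random.

    Hypothetical helper: the study's actual assignment procedure
    is not described at this level of detail in the article.
    """
    rng = random.Random(seed)
    shuffled = list(developer_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return set(shuffled[:half]), set(shuffled[half:])


def relative_lift(treated, control):
    """Percent difference in the mean outcome, treated vs. control.

    Both arguments are sequences of per-developer weekly counts,
    for example pull requests completed per week.
    """
    base = mean(control)
    if base == 0:
        raise ValueError("control mean is zero; relative lift is undefined")
    return (mean(treated) - base) / base * 100.0
```

With made-up inputs, `relative_lift([5, 5], [4, 4])` compares a treated mean of 5 pull requests per week against a control mean of 4. A real analysis would also need confidence intervals and would have to account for developers who were offered the tool but did not adopt it.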
The research team analyzed their results across all devs as well as by developer tenure and skill level. They found that short-tenured and junior devs were more likely to adopt Copilot and to continue using it for more than one month, and that these devs were more likely to accept the code generated by Copilot. These devs also saw the largest productivity gains from the tool.
Wharton professor Ethan Mollick shared the results in a thread on X, writing:
We now have randomized controlled trials showing large performance gains in real companies for coding, management, entrepreneurship, and writing using AI.
In a discussion about the study on Hacker News, several users shared that the paper's results matched their own experience with Copilot. One user wrote:
The most interesting thing about this study for me is that when they break it down by experience levels, developers who are above the median tenure show no statistically significant increase in productivity...Copilot is nice for resolving some tedium and freeing up my brain to focus more on deeper questions, but it's not as world-altering as junior devs describe it as. It's also frequently subtly wrong in ways that a newer dev wouldn't catch, which requires me to stop and tweak most things it generates in a way that a less experienced dev probably wouldn't know to.
The effect of generative AI on employee productivity, and specifically software developer productivity, is an open research area. Earlier this year, InfoQ covered a survey by Upwork Research Institute, where a majority of employees surveyed actually reported that GenAI decreased their productivity. InfoQ also covered a study by eBay where GitHub Copilot did increase developer productivity.
About the Author
Anthony Alford