URL: https://thenewstack.io/where-do-data-practitioners-prefer-to-collaborate-github/

Where Do Data Practitioners Prefer to Collaborate? GitHub

Although the collaboration tools space is getting more crowded, machine learning and data practitioners still prefer GitHub to other tools, says the latest survey by Kaggle.
Jul 11th, 2022 11:37am by Lawrence E Hecht
Two-thirds of data practitioners publicly share their data analysis or machine learning applications, according to The New Stack’s analysis of Kaggle’s latest annual survey of machine learning and data science.

Of those collaborating publicly, 76% said they do so using GitHub. Despite its critics, the platform continues to be one of the most critical parts of the tech stack for developers and non-developers building data and artificial intelligence-enabled applications.

In 2021, more than 25,000 people took the survey. Since many of the participants were using the Google-owned Kaggle platform to learn how to become data scientists, The New Stack’s analysis looked only at the 17,182 respondents who reported being employed.

Of the 840 machine learning engineers in the study, 61% said they use GitHub for sharing, the highest percentage of any profession in the report to do so. While only 40 developer relations/advocates took part in the study, it is noteworthy that only 45% said they use GitHub to share their applications or analysis.

Figure: Where do you publicly share your data analysis or machine learning applications?

Data scientists, software developers and data analysts represented the largest portion of the study’s participants. Here are a few more takeaways from the study:

  • Collaboration tools built for data science, machine learning and artificial intelligence use cases did not see widespread adoption in the Kaggle survey. Of the study participants who said they collaborated publicly, a third used Kaggle itself and 20% used Colab, which is also a Google product. Since these offerings are affiliated with the survey itself, we don’t think they represent anything about the larger market.
  • Streamlit, which was bought by Snowflake earlier this year, was cited as a preferred collaboration tool by 4%. In May, Streamlit’s former CEO described the rise of data-driven apps to The New Stack.
  • Open source nbviewer and Plotly Dash, which has turned a popular open source visualization tool into a low-code platform, were two other ways data analysis and ML apps are shared.

IDEs and Collaboration

Collaboration is also taking place in and between notebooks, which have taken on a life of their own as integrated development environments (IDEs). Like most developers, the average data practitioner uses more than one IDE, but some flavor of Jupyter or JupyterLab is most common, with Visual Studio Code placing second. Yet many types of hosted notebooks are struggling to catch on in a crowded field:

  • More than a third of the study’s participants reported using Kaggle and Colab Notebooks. Google appears to be having success turning these users into paying customers for its other notebook and cloud offerings.
  • Eight percent are using Binder, which turns a Git repo of Jupyter notebooks into an interactive live environment.
  • Overall, 7% of study participants said they use specific Amazon Web Services and Microsoft Azure notebook offerings. However, more than 15% of AWS and Microsoft Azure cloud computing customers are also using a notebook or other AI-type solution from their cloud provider.
  • Databricks and IBM offerings got more than passing mentions but remain niche products.
  • Deepnote, Code Ocean, Gradient, and Observable were each used by only 1% of study participants.

We are still in the early days of data-enabled applications. Most data analysts are not interested in software licensing or which code repository they use. They want to go where the data is and where people are most likely to be sharing their models. According to Meltano, a company spun off by GitLab itself, that’s GitHub.

I could provide a huge list of low-code platforms, DataOps pipeline integrations, collaboration tools, and next-generation Airtables, many with strong followings. But few, if any, of them are truly close to mass adoption. Some have reached viability as niche products in niche industries, but only variations of Jupyter notebooks and GitHub seem familiar enough to non-technical audiences, data pros and developers to become a breakthrough hit.

What do you think? How can the modern data stack break out of the pattern without stifling collaboration? You can reach out here.

Lawrence has generated actionable insights and reports about enterprise IT B2B markets and technology policy issues for over 25 years. He regularly works with clients to develop and analyze studies about open source ecosystems. In addition to his consulting work,...
Amazon Web Services and GitLab are sponsors of The New Stack.
TNS owner Insight Partners is an investor in: Databricks.