URL: https://towardsdatascience.com/the-many-faces-of-bias-c515cd483db4/

The Many Faces of Bias

Our weekly selection of must-read Editors' Picks and original features

Photo by Fleur on Unsplash

Bias is a charged term in data science and machine learning, not least because it means so many different things to practitioners. Imbalanced statistical distributions are a form of bias, but so is the representation of racial and gender stereotypes in language models’ training data, or the way researchers’ assumptions get baked into the algorithms they build.

In our latest Author Spotlight Q&A, we chatted with Conor O’Sullivan about his growing interest in algorithmic fairness, a subfield devoted to countering biases in a wide range of data science practices and workflows. It felt only natural to expand on this topic in this week’s Variable, where we highlight several recent articles that approach bias with great nuance and from multiple angles.

  • A useful primer on preventing and removing bias from datasets. If you’ve just recently tapped into conversations around fairness and bias, a very good place to start is Ella Wilson’s debut TDS post. It defines the core concepts you need to know, and also introduces some of the main approaches to tackling the problem of bias in training data.
  • A close look at the social impacts of bias. "When someone practices data science, they are either challenging or enforcing an existing structure of power." The starting point of Aisulu Omar’s thought-provoking article is that working with massive datasets isn’t inherently good or bad, but that the combination of non-diverse teams and under-informed individual practitioners can cause (or perpetuate) harm.
  • On the issue of inference and multiple treatments. Turning to the statistical side of things, Matteo Courthoud’s latest explainer is a lucid and engaging analysis of a recent paper on contamination bias: the problem that arises when we want to observe the effects of multiple, mutually exclusive treatments in contexts like experimental drugs, UX design, or policy debates.
  • Evaluating survival analysis models correctly. Issues surrounding accurate interpretation are also at the core of Nicolo Cosimo Albanese’s deep dive on performance-evaluation metrics for survival analysis. He covers the ins and outs of several common metrics, and shares examples (in Python) to show readers how to go about choosing the right one.

If you’d like to exercise a few other data-science muscle groups this week—and why wouldn’t you?—here are a few recommended reads, spanning a wide spectrum of topics and approaches.


If you felt inspired to become a Medium member recently to support our authors’ work, we truly appreciate it! And we’re always grateful to all our readers and followers for keeping our community vibrant and supportive.

Until the next Variable,

TDS Editors

