Man is to Doctor as Woman is to Nurse: the Dangerous Bias of Word Embeddings

Why we should worry about gender inequality in Natural Language Processing techniques

Mar 8, 2019

Culture and Language, Language and Machines

Gender in an Elevator

Imagine you are stepping into a hospital elevator with two people: a man and a woman. They wear the same white coat, but their badges are covered, so you can’t tell what their roles in the hospital are. At some point, the elevator doors open and a young lady pops up saying: "Good morning, Doctor Smith!". Which one do you think is going to greet her back?

Common sense says that both are equally likely to be Doctor Smith, but several experiments show that individuals are much more likely to associate women with nursing and men with doctoring. So you’d probably end up turning your head towards the white-coated guy, expecting him to say something.

If you are a sci-fi lover, you probably know that doctor Smith’s gender can’t be established without knowing which version of Lost in Space you’re referring to. [© Netflix, CBS]

How Language Reflects (Bad) Culture

Gender stereotypes are still deeply rooted in our society, and women grow up and live being treated very differently from men through conscious and unconscious gender biases.

Language is one of the most powerful means through which sexism and gender discrimination are perpetuated and reproduced. Lexical choices and everyday communication constantly reflect this long-standing bias, to the point that language itself encodes social asymmetries. For example, grammatical and syntactical rules are built in such a way that feminine terms are usually derived from their corresponding masculine form (e.g. princ-ess, god-dess, etc.) [Menegatti et al, 2017]. Similarly, masculine nouns and pronouns are often used generically to refer to both men and women (e.g. man-kind, states-men, etc.), a usage often perceived as discriminating against women. These issues are common to most languages, highlighting how gender discrimination spans the whole world and involves society in its entirety.

Since language and culture are so deeply related to each other, it’s easy to see the dangerous implications that stereotypes could have for automatic tasks based on understanding and processing human language, from web browsers’ autocompletion to AI-powered bots (remember that time when Microsoft’s AI chatbot went full nazi?) and the ranking of query results.

Let’s dig a bit deeper with some technical-ish details (no hard math involved!).

Surprisingly (?) biased Google Search’s autocomplete suggestions from a 2013 query [© UN Women]

Gender Bias in NLP: Garbage In, Garbage Out

What is natural language processing?

Natural Language Processing (NLP) is a branch of Artificial Intelligence (AI) that helps computers understand, interpret and manipulate natural (i.e. human) language. Imagine NLP-powered machines as black boxes that are capable of understanding and evaluating the context of the input documents (i.e. collections of words), outputting meaningful results that depend on the task the machine is designed for.

Documents are fed into a magic NLP model capable of extracting, for instance, the sentiment of the original content

As with any other machine learning algorithm, biased data results in biased outcomes. And, as with any other algorithm, debiasing the results is painfully annoying, to the point that it might be simpler to unbias society itself.

Dear humanity: please stop being sexist. Sincerely, your friendly neighborhood data scientist.

Now, let’s see how the gender bias propagates through NLP models.

The big deal: word embeddings

Words must be represented as numeric vectors in order to be fed into machine learning algorithms. One of the most powerful (and popular) ways to do it is through Word Embeddings. In word embedding models, each word in a given language is assigned a high-dimensional vector, such that the geometry of the vectors captures relations between the words. For instance, the vector representation of the word king will have a higher cosine similarity with that of queen than with that of potato. Remember that the cosine similarity is just the cosine of the angle between two vectors and can be defined as:

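$$\cos(\theta) = \frac{v_1 \cdot v_2}{\lVert v_1 \rVert \, \lVert v_2 \rVert} = \frac{\sum_i v_{1,i} \, v_{2,i}}{\sqrt{\sum_i v_{1,i}^2} \, \sqrt{\sum_i v_{2,i}^2}}$$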

Therefore, the higher the cosine similarity is, the closer the directions of the two vectors are. Let’s try to turn this into a simple R script, using a pretrained GloVe Word Embedding (you can find it here).

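> # Load the pretrained vectors (assumes the first CSV column holds the words)
> word_vec = as.matrix(read.csv('./glove.6B.300d.csv', row.names = 1))

> # Cosine similarity: dot product divided by the product of the two norms
> cosine_sim = function(v1,v2){
+ return(sum(v1 * v2) / (sqrt(sum(v1^2)) * sqrt(sum(v2^2))))
+}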

> cosine_sim(word_vec["king",],word_vec["queen",])
[1] 0.67

> cosine_sim(word_vec["king",],word_vec["potato",])
[1] 0.23

These astonishing results are possible thanks to the way word embeddings are learned: following the Distributional Hypothesis, embedding models look, in an unsupervised way, at each target word along with its context (i.e. the set of words that come before and after the target word), either building and processing co-occurrence matrices (GloVe) or training cleverly designed neural networks to solve a word prediction task (Word2Vec and FastText).
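To make the idea of a context window concrete, here is a minimal sketch of how co-occurrences could be counted on a toy corpus. The corpus, the window size and the function name `cooccurrence` are illustrative assumptions; GloVe works on far larger corpora and fits its vectors on top of counts like these.

```r
# Toy corpus: each element is one tokenized sentence (illustrative data)
corpus = list(
  c("the", "doctor", "greets", "the", "nurse"),
  c("the", "nurse", "greets", "the", "doctor")
)

# Count how often each pair of words appears within `window` positions of each other
cooccurrence = function(sentences, window = 2){
  vocab = sort(unique(unlist(sentences)))
  counts = matrix(0, nrow = length(vocab), ncol = length(vocab),
                  dimnames = list(vocab, vocab))
  for (tokens in sentences){
    for (i in seq_along(tokens)){
      # Positions of the context words around the target at position i
      lo = max(1, i - window)
      hi = min(length(tokens), i + window)
      for (j in setdiff(lo:hi, i)){
        counts[tokens[i], tokens[j]] = counts[tokens[i], tokens[j]] + 1
      }
    }
  }
  return(counts)
}

counts = cooccurrence(corpus, window = 2)
print(counts["doctor", ])
```

Each row of `counts` describes a word through the company it keeps, which is the raw material the Distributional Hypothesis relies on.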

In order to preserve the semantics of natural language in its entirety, word embeddings are usually trained on massive text databases like Wikipedia dumps (6 million English pages) or Google News (about 100 billion words), inheriting from them all the biases we pointed out previously. Of course, context-specific embeddings can always be learned using smaller datasets.

Kings and queens: where word embeddings actually do the magic

Let’s stick with the words king and queen. As human beings, we know that kings are royal male figures that rule countries, while we use the word queen to describe a woman who is married to a king or is leading a kingdom by herself (queens ruling king-doms…can you see it?). The set of semantic relations revolving around kings and queens is quite straightforward to us, but can be very hard to grasp for a machine that can’t read between the lines. Word embeddings allow machines to capture (part of) these analogies and relations, including gender.

Since we are talking about numeric vectors, we can estimate the direction of a semantic relation simply by taking the difference between the vector representations of king and man. What happens then if we shift the woman vector along the same direction? Surprise: we land next to the word queen. The information that queen is the feminine of king has never been fed directly into the model, but the model is able to capture this relation anyway.

2D representation of the semantic analogies between king, queen, man and woman
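The arithmetic behind this picture can be sketched with hand-made 2D vectors. The numbers below are invented for illustration (the first component loosely stands for "royalty", the second for "gender") and are not taken from GloVe; real embeddings satisfy the relation only approximately.

```r
# Hypothetical 2D embeddings: first component ~ royalty, second ~ gender
toy_vec = rbind(
  king  = c(0.9,  0.8),
  queen = c(0.9, -0.8),
  man   = c(0.1,  0.8),
  woman = c(0.1, -0.8)
)

# Direction that turns "man" into "king"
royal_direction = toy_vec["king", ] - toy_vec["man", ]

# Apply the same shift to "woman"
x = toy_vec["woman", ] + royal_direction

# Cosine similarity between x and every word in the toy vocabulary
similarities = apply(toy_vec, 1, function(v){
  sum(v * x) / (sqrt(sum(v^2)) * sqrt(sum(x^2)))
})
print(names(which.max(similarities)))
```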

At this point, you can probably guess where this is going: what if we try to do the same for – let’s say – occupations? As expected, here is where the gender stereotype really kicks in.

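> # Solve the analogy v1 : v2 = v3 : x by vector arithmetic.
> # get_closest() is assumed to return the vocabulary ranked by
> # similarity to x, as a vector named by word (definition not shown).
> get_analogy = function(v1,v2,v3){
+ x = v2-v1+v3
+ return(names(get_closest(x,word_vec))[1])
+}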

> get_analogy(word_vec["man",],
+ word_vec["doctor",],
+ word_vec["woman",])
[1] "nurse"

Damn! Apparently, our algorithm experienced the elevator-in-hospital situation too.
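The `get_closest` helper used above is not shown in the article. A minimal sketch of what such a function might look like follows, assuming the embeddings are stored as a numeric matrix with one row per word and the words as row names. The toy matrix only stands in for the real GloVe table, and its values are invented.

```r
# Hypothetical helper: rank every word in `vectors` by cosine similarity to `x`
get_closest = function(x, vectors){
  x = as.numeric(x)
  sims = as.vector(vectors %*% x) / (sqrt(rowSums(vectors^2)) * sqrt(sum(x^2)))
  names(sims) = rownames(vectors)
  return(sort(sims, decreasing = TRUE))
}

# Invented 3D vectors, only to show the call
toy_vec = rbind(
  doctor = c(0.9, 0.1, 0.3),
  nurse  = c(0.8, 0.2, -0.4),
  potato = c(-0.2, 0.9, 0.0)
)

closest = get_closest(c(1, 0, 0), toy_vec)
print(names(closest)[1])
```

With real embeddings, the words used to build the query often rank highest themselves, so analogy implementations commonly drop them from the candidates before picking the top result.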

Similar results can be obtained for many different occupations. For instance, if we try to get the missing piece of the man : programmer = woman : X analogy, we unbelievably end up with X = homemaker. Of course the word programmer (as well as the word homemaker) is gender neutral by definition, but an embedding model trained on a news corpus tends to place programmer closer to male terms than to female ones because of the social perception we have of this job, which is reflected in the language we use. [Bolukbasi et al, 2016]

Gender bias through word clouds

Sexism is even easier to spot when we look at the K-nearest embedding neighbours of man and woman. You can guess the target gender of the two clouds even without looking at the description.

Word clouds for the nearest neighbours of "man" (L) and "woman" (R). Topic observed: "occupations"

In the man subgroup, we can find words like thug, businessman, mechanic, lieutenant, butcher and cop. Among the most related words in the woman subgroup, instead, we observe terms like midwife, nurse, receptionist, waitress, flight attendant and…prostitute. Horrific.

Even scarier results can be retrieved by switching the domain of interest from occupations to adjectives. According to the embeddings, men are cocky, decent, crafty, brilliant, clever and humble. Women instead – unsurprisingly at this point – are described as sassy, sexy, tasteful, attractive and gorgeous.

Is this really all we have to describe a woman?

Considering the 200 adjectives closest to man or woman, 76% of them are closer to "man". Within the remaining 14%, more than 50% are beauty-related terms.
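One way such a split could be computed is to compare, for each adjective, its cosine similarity to man and to woman, and assign it to the closer of the two. The sketch below uses invented 2D vectors and a hypothetical `closer_to` function; it is not necessarily the procedure behind the figures above, which the article does not spell out.

```r
cosine_sim = function(v1, v2){
  sum(v1 * v2) / (sqrt(sum(v1^2)) * sqrt(sum(v2^2)))
}

# Invented vectors for the two gender words and a few adjectives
man   = c(1.0,  0.5)
woman = c(1.0, -0.5)
adjectives = rbind(
  clever    = c(0.7,  0.3),
  brilliant = c(0.8,  0.2),
  gorgeous  = c(0.6, -0.6)
)

# Label each adjective with the gender word it is more similar to
closer_to = function(v){
  if (cosine_sim(v, man) > cosine_sim(v, woman)) "man" else "woman"
}
labels = apply(adjectives, 1, closer_to)

# Share of adjectives falling on each side
print(prop.table(table(labels)))
```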

You can try this yourself with this incredible visualization tool crafted at the Hacking Discrimination hackathon held at Microsoft New England Research & Development Center in 2017.

Implications of a gender-biased AI in real life

Given that gender biases in Natural Language Processing do exist and should be avoided for ethical reasons alone, what are the implications of a stereotype-influenced AI for humans’ everyday life? In other words, why should we care about it?

Among all the pros of automatic systems, being incorruptible, dedicated and dutiful workaholics is probably one of the most important. Machine learning systems can determine the eligibility for a loan without being influenced by the race of the applicant, they can provide access to information and services without gender-based discrimination, they can recruit the best candidate for a company without being affected by their sexual orientation, etc. However, when machine learning systems start becoming more human-like in their predictions, they could also start perpetuating human behaviours, losing one of their main advantages over humans: not being human.

Let’s consider the job recruitment example. We would like to design an NLP-powered automatic system that generates a job fitness score based on candidates’ motivation letters. Assuming the company HR’s database isn’t big enough to train its own embeddings, we decide to use a 300-dimensional GloVe word embedding pretrained on Wikipedia. The chances of finding positive adjectives like crafty, brilliant and clever in a motivation letter are high, but we observed that these terms are closer to man than to woman in the pretrained embedding space. Moreover, let’s say we’re evaluating candidates for a junior programmer position, which we know is not treated as gender neutral by our embeddings. In this scenario, if the model succeeds in inferring the gender of the candidate, it will be strongly influenced by it during candidate selection.

Conclusion

Gender inequality is still deeply rooted in our society and the blind application of machine learning algorithms runs the risk of propagating and amplifying all the biases that are present in the original context. Perpetuating this unpleasant human behavior is not only profoundly unethical, but it may also have alarming consequences in many different decision-making scenarios.

Nowadays, data scientists are working hard to solve the problem, coming up with very clever de-biasing strategies [Bolukbasi et al., 2016][Chakraborty et al., 2016] which reduce the gender polarization while preserving the useful properties of the embedding. Nevertheless, we should be very careful when applying NLP techniques in any context where social biases aren’t supposed to make a difference. Acting this way will improve the efficiency of our models and, eventually, ourselves as human beings.
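As a rough illustration of what reducing the gender polarization can mean geometrically, the sketch below removes from a word vector its component along a gender direction, in the spirit of the neutralize step described by Bolukbasi et al. It is a simplification under stated assumptions: the gender direction is taken as a single difference of two invented vectors, whereas the paper estimates it from several word pairs, and `neutralize` is a hypothetical name.

```r
# Invented 3D vectors; real embeddings have hundreds of dimensions
he         = c( 0.6, 0.2, 0.1)
she        = c(-0.6, 0.2, 0.1)
programmer = c( 0.3, 0.7, 0.4)

# Unit vector pointing along the (simplified) gender direction
gender_dir = (he - she) / sqrt(sum((he - she)^2))

# Subtract the projection of w onto the gender direction
neutralize = function(w, direction){
  w - sum(w * direction) * direction
}

programmer_debiased = neutralize(programmer, gender_dir)

# Component along the gender direction before and after
print(sum(programmer * gender_dir))
print(sum(programmer_debiased * gender_dir))
```

The components orthogonal to the gender direction are left untouched, which is the sense in which the rest of the embedding's geometry is preserved.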

References and further reading

Here you can find my main sources, along with some very interesting readings which I suggest you have a look at if you are interested in digging deeper into word embeddings and bias removal.

Hopefully this blog post has been able to highlight the gender inequality lying within our society and the reason why we should worry about gender stereotypes in NLP. Please leave your thoughts in the comments section and share if you find this helpful!

Written By

Tommaso Buonocore

AI, Gender Equality, Machine Learning, NLP, Word Embeddings

URL: https://towardsdatascience.com/gender-bias-word-embeddings-76d9806a0e17/