Social media users frequently encounter abuse, harassment, and insults from other users on most online communication platforms, such as Facebook, Instagram and YouTube. As a result, many users stop expressing their ideas and opinions.
What is the solution?
The solution to this problem is to build a model that can identify the types of toxicity present in a comment, such as threats, obscenity, insults and racism, and thereby promote a peaceful environment for online dialogue.
In this article, we will learn about toxic comment multi-label classification and build a model that classifies comments into several toxicity labels.
The toxicity class refers to any comment or text containing offensive or hurtful words. This can involve insults, slurs or other offensive language.
Supervised classification problems can be divided into three groups, based on how many categories there are and how many of them a single instance can belong to:
1. Binary classification:
It is a type of supervised machine-learning problem that classifies data into two mutually exclusive groups or categories. The two categories can be labelled as true and false, 0 and 1, positive and negative, etc.
In toxic comment classification, the model is trained to predict whether a comment is toxic (class 1) or non-toxic (class 0).
Example:
"I hate you!" Predicted class: Toxic (class 1)
"I like you!" Predicted class: Non-toxic (class 0)
2. Multiclass classification:
It is a type of supervised machine-learning problem that classifies data into three or more groups/categories.
A multiclass classifier for Toxic comment classification is trained to detect various degrees of toxicity in comments, such as mild toxicity, severe toxicity, and non-toxic comments, as opposed to just differentiating between toxic and non-toxic comments (binary classification).
Example:
"I want to kill you!" Predicted class: Severe toxicity
"You are so ugly and unconfident" Predicted class: Mild toxicity
"You are a good person" Predicted class: Non-toxic
3. Multilabel classification:
Multilabel classification is a supervised machine-learning approach where a single instance can be associated with multiple labels simultaneously. It allows the model to assign zero, one, or more labels to each data sample based on its characteristics.
In the context of toxic comment classification, a comment or text can be labelled with multiple toxicity categories if it contains various forms of harmful language.
Example:
"You're an idiot person, and I hope someone hits you!"
Multiple labels: Offensive language (class 1), Threats (class 1), Hatred (class 1), Non-toxic (class 0)
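To make the idea concrete, a multilabel target can be stored as one 0/1 flag per label. The sketch below is illustrative only: the label names are the six label columns of the dataset used later in this article, and `encode_labels` is a hypothetical helper, not part of any library.

```python
# The six toxicity labels used by the dataset in this article.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def encode_labels(active_labels):
    """Return a multi-hot vector: 1 where the label applies, 0 otherwise."""
    unknown = set(active_labels) - set(LABELS)
    if unknown:
        raise ValueError(f"Unknown labels: {sorted(unknown)}")
    return [1 if name in active_labels else 0 for name in LABELS]

# A comment can carry several labels at once ...
insulting_threat = encode_labels({"toxic", "threat", "insult"})
# ... or none at all, which is how a clean comment is represented.
clean_comment = encode_labels(set())

print(insulting_threat)  # [1, 0, 0, 1, 1, 0]
print(clean_comment)     # [0, 0, 0, 0, 0, 0]
```

Unlike multiclass classification, the flags are independent of each other, so any combination of them is a valid target.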
Let's get started!
About the dataset:
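We have a large number of Wikipedia comments that human raters have labelled for toxic behaviour. The dataset variables are:
id: a unique identifier for each comment
comment_text: the text of the comment
toxic, severe_toxic, obscene, threat, insult, identity_hate: binary labels (1 if the comment was rated as belonging to that category, 0 otherwise)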
Access the dataset: Toxic Comments dataset
Now, the coding part begins!
We use PyTorch together with Transformers, which gives a flexible and intuitive interface for building and training deep learning models.
!pip install torch
Transformers is used for BERT (Bidirectional Encoder Representations from Transformers).
!pip install transformers
Loading the dataset into a data frame and printing its first five rows gives:
Output:
id comment_text toxic \
0 0000997932d777bf Explanation\nWhy the edits made under my usern... 0
1 000103f0d9cfb60f D'aww! He matches this background colour I'm s... 0
2 000113f07ec002fd Hey man, I'm really not trying to edit war. It... 0
3 0001b41b1c6bb37e "\nMore\nI can't make any real suggestions on ... 0
4 0001d958c54c6e35 You, sir, are my hero. Any chance you remember... 0
severe_toxic obscene threat insult identity_hate
0 0 0 0 0 0
1 0 0 0 0 0
2 0 0 0 0 0
3 0 0 0 0 0
4 0 0 0 0 0
Output:
[Figure: Distribution of Label Occurrences]
Checking exact values for each class
Output:
threat 478
identity_hate 1405
severe_toxic 1595
insult 7877
obscene 8449
toxic 15294
dtype: int64
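Counts like the ones above are what you get by summing each label column. Here is a minimal sketch, assuming the data sits in a pandas DataFrame with the six label columns shown in the preview; the tiny `df` below is made-up data for illustration, not the real dataset.

```python
import pandas as pd

label_cols = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Made-up rows standing in for the real comments (illustration only).
df = pd.DataFrame(
    {
        "comment_text": ["a", "b", "c", "d"],
        "toxic":         [1, 1, 0, 1],
        "severe_toxic":  [0, 1, 0, 0],
        "obscene":       [1, 1, 0, 0],
        "threat":        [0, 0, 0, 0],
        "insult":        [1, 0, 0, 1],
        "identity_hate": [0, 0, 0, 0],
    }
)

# Each label column holds 0/1, so a column sum is the number of
# comments that carry that label. Sorting puts the rarest label first.
label_counts = df[label_cols].sum().sort_values()
print(label_counts)
```

Because a comment can have several labels, these counts can add up to more than the number of toxic comments.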
Let's check whether the data is balanced by splitting it into toxic and clean subsets, and then building a new data frame to visualize how the comments are distributed.
Output:
[Figure: Distribution of Toxic and Clean Comments]
We can observe that our dataset is severely imbalanced.
Let's have a look at the proportion of toxic and clean comments in numbers in order to know the exact numbers and balance the data accordingly.
Output:
(16225, 8)
(143346, 8)
Clean comments outnumber toxic ones by almost nine to one (143,346 versus 16,225).
To handle the imbalance, we create a new training set that keeps all 16,225 toxic comments and adds 16,225 randomly sampled clean comments to match.
The new balanced data frame
Let's verify with actual figures
Output:
(16225, 8)
(16225, 8)
(32450, 8)
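The balancing step described above can be sketched as follows. This is an assumption about how it might be done, not the article's exact code: a comment is treated as toxic if any of its six labels is 1, and `random_state=42` is an arbitrary seed. The toy `df` is made up for illustration.

```python
import pandas as pd

label_cols = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Made-up data: 2 toxic comments and 6 clean ones (illustration only).
df = pd.DataFrame({"comment_text": [f"comment {i}" for i in range(8)]})
for col in label_cols:
    df[col] = 0
df.loc[0, ["toxic", "insult"]] = 1
df.loc[1, "toxic"] = 1

# A row is toxic if at least one label is set, clean if all labels are 0.
is_toxic = df[label_cols].sum(axis=1) > 0
toxic_df = df[is_toxic]
clean_df = df[~is_toxic]

# Undersample the clean comments so both groups have the same size.
clean_sampled = clean_df.sample(n=len(toxic_df), random_state=42)

# Combine and shuffle so toxic and clean rows are interleaved.
balanced_df = pd.concat([toxic_df, clean_sampled]).sample(frac=1, random_state=42)

print(toxic_df.shape, clean_sampled.shape, balanced_df.shape)
```

Undersampling discards most of the clean comments. That is a trade-off: training is faster and the classes are even, but the model sees less variety of clean text.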
Now that the dataset is balanced, with exactly equal numbers of toxic and clean comments, we can proceed to tokenizing and encoding the comments using BertTokenizer.
In this step, we split the data into training, validation, and testing sets. The data is divided into training and testing sets first, and then the testing set is further split into validation and testing sets.
Now, we split the validation set
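A sketch of the two-stage split, assuming scikit-learn's `train_test_split`. The 70/30 first split is consistent with the 22,715 training comments reported further down (70% of 32,450), but the even validation/test split and the seed are assumptions, and the arrays are toy stand-ins.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy stand-ins: 100 "comments" with 6 binary labels each (illustration only).
texts = np.array([f"comment {i}" for i in range(100)])
labels = np.random.default_rng(0).integers(0, 2, size=(100, 6))

# Stage 1: hold out 30% of the data.
train_texts, temp_texts, train_labels, temp_labels = train_test_split(
    texts, labels, test_size=0.3, random_state=42
)

# Stage 2: split the held-out part evenly into validation and test sets.
val_texts, test_texts, val_labels, test_labels = train_test_split(
    temp_texts, temp_labels, test_size=0.5, random_state=42
)

print(len(train_texts), len(val_texts), len(test_texts))  # 70 15 15
```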
Now, we will tokenize and encode the comments and labels for the training, testing, and validation sets.
Defining 'tokenize_and_encode' function to perform this task
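One possible shape for this function is sketched below. It assumes a Hugging Face tokenizer that can be called on a list of strings (as `BertTokenizer` can) and a maximum length of 128 tokens, which matches the tensor shapes printed later. The signature is a guess, not the article's original code.

```python
import torch

def tokenize_and_encode(tokenizer, comments, labels, max_length=128):
    """Turn raw comments and their labels into tensors for BERT.

    tokenizer: a Hugging Face tokenizer such as BertTokenizer (assumed).
    comments:  a sequence of strings.
    labels:    a sequence of 0/1 label rows, one row per comment.
    """
    encoded = tokenizer(
        list(comments),
        padding="max_length",    # pad short comments up to max_length
        truncation=True,         # cut long comments down to max_length
        max_length=max_length,
        return_tensors="pt",     # return PyTorch tensors
    )
    # Float labels are what a multi-label (sigmoid-based) loss expects.
    label_tensor = torch.tensor(labels, dtype=torch.float32)
    return encoded["input_ids"], encoded["attention_mask"], label_tensor
```

The function returns three tensors per dataset split: the token ids, the attention mask that marks real tokens versus padding, and the label matrix.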
Now, we will initialize the BERT tokenizer with the 'bert-base-uncased' model
After this step, we will initialize the BERT model for sequence classification
As an additional step for faster processing, move the model to the GPU if one is available, or keep it on the CPU if not.
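Device selection in PyTorch usually looks like the sketch below. A tiny linear layer stands in for the BERT model so that the example runs without downloading any weights; the layer sizes are arbitrary.

```python
import torch
from torch import nn

# Use the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# A tiny layer stands in for the BERT model here (illustration only).
model = nn.Linear(4, 6)
model = model.to(device)

# Inputs must live on the same device as the model.
batch = torch.randn(2, 4).to(device)
logits = model(batch)
print(device, logits.shape)
```

The same `.to(device)` call has to be applied to every batch during training and evaluation, otherwise PyTorch raises a device mismatch error.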
Tokenize and Encode the comments and labels of the train, test and validation set
Output:
Training Comments : (22715,)
Input Ids : torch.Size([22715, 128])
Attention Mask : torch.Size([22715, 128])
Labels : torch.Size([22715, 6])
Let's check an encoded text with the corresponding text and labels
Output:
Training Comments -->> I have edited the text and wrote with neutral information. Please suggest what went wrong.
Input Ids -->>
tensor([ 101, 1045, 2031, 5493, 1996, 3793, 1998, 2626, 2007, 8699, 2592, 1012,
3531, 6592, 2054, 2253, 3308, 1012, 102, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0])
Decoded Ids -->>
[CLS] i have edited the text and wrote with neutral information. please suggest what went wrong. [SEP]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD]
Attention Mask -->>
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0])
Labels -->> tensor([0., 0., 0., 0., 0., 0.])
Now, we will create data loaders to efficiently load the data during training, testing, and validation. The data loaders batch the input data and handle shuffling for the training data.
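A minimal sketch of such loaders, assuming `TensorDataset` and `DataLoader` from `torch.utils.data` and the batch size of 32 that appears in the output below. The tensors are random stand-ins with the same layout as the encoded data.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Random stand-ins with the same layout as the encoded data (illustration only).
num_samples, max_length, num_labels = 100, 128, 6
input_ids = torch.randint(0, 1000, (num_samples, max_length))
attention_mask = torch.ones((num_samples, max_length), dtype=torch.long)
labels = torch.randint(0, 2, (num_samples, num_labels)).float()

dataset = TensorDataset(input_ids, attention_mask, labels)

# Shuffle only the training data; evaluation order does not need to change.
train_loader = DataLoader(dataset, batch_size=32, shuffle=True)
eval_loader = DataLoader(dataset, batch_size=32, shuffle=False)

batch_ids, batch_mask, batch_labels = next(iter(train_loader))
print(batch_ids.shape, batch_mask.shape, batch_labels.shape)
```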
Let's check the train_loader data
Output:
Batch Size : 32
Each Input ids shape : torch.Size([32, 128])
Input ids :
tensor([ 101, 2175, 3280, 1999, 1037, 2543, 1012, 1045, 2123, 2102,
2228, 3087, 2106, 2062, 4053, 2000, 16948, 2059, 2017, 1999,
1996, 2197, 2048, 2086, 1012, 9119, 1010, 3246, 2017, 2123,
2102, 2272, 2067, 2007, 1037, 28407, 13997, 1006, 2029, 2017,
2471, 5121, 2097, 999, 999, 999, 1007, 6109, 1012, 6564,
1012, 2382, 1012, 19955, 102, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0])
Corresponding Decoded text:
[CLS] go die in a fire. i dont think anyone did more damage to wikipedia then you in the last two years. goodbye,
hope you dont come back with a sock puppet ( which you almost certainly will!!! ) 93. 86. 30. 194 [SEP]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
[PAD] [PAD] [PAD] [PAD] [PAD]
Corresponding Attention Mask :
tensor([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0])
Corresponding Label: tensor([1., 0., 1., 0., 1., 0.])
AdamW optimizer: We use the AdamW optimizer, a variant of Adam (Adaptive Moment Estimation) that applies weight decay separately from the gradient-based update. Adam combines the advantages of two other optimization strategies, RMSprop (Root Mean Square Propagation) and AdaGrad (Adaptive Gradient Algorithm).
For each model parameter, it keeps moving averages of the gradient and the squared gradient, which are used to adapt the learning rate of each parameter during training.
Output:
Epoch 1, Training Loss: 0.20543626952968852,Validation loss:0.1643741050479459
Epoch 2, Training Loss: 0.13793433358971502,Validation loss:0.14861836971021167
Epoch 3, Training Loss: 0.11418234390587034,Validation loss:0.1539663544862099
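To illustrate the moving averages that AdamW maintains, here is a single-parameter sketch of the update rule written in plain Python. It is for intuition only and is not how PyTorch implements it internally. The hyperparameter values are common defaults chosen for the example, not values taken from this article.

```python
import math

def adamw_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """Apply one AdamW update to a single scalar parameter.

    m and v are the running averages of the gradient and squared gradient,
    and t is the 1-based step count used for bias correction.
    """
    # Decoupled weight decay: shrink the parameter directly, not via the gradient.
    param = param - lr * weight_decay * param

    # Update the moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad

    # Correct the bias caused by starting both averages at zero.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)

    # Scale the step by the root of the squared-gradient average.
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Minimize f(x) = x**2, whose gradient is 2 * x, starting from x = 1.0.
x, m, v = 1.0, 0.0, 0.0
for step in range(1, 201):
    x, m, v = adamw_step(x, 2 * x, m, v, step, lr=0.05)
print(abs(x) < 1.0)
```

In practice you would not write this by hand. `torch.optim.AdamW` applies the same kind of update to every parameter tensor of the model.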
Let's evaluate the model now
Output:
Accuracy: 0.7099
Precision: 0.8059
Recall: 0.8691
Now, we can evaluate the model based on the metrics results achieved here.
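For reference, metrics like these can be computed on multi-label predictions with scikit-learn. The sketch assumes exact-match accuracy and micro-averaged precision and recall; the article does not state which averaging it used, and the arrays are made up.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Made-up ground truth and predictions: 4 comments x 6 labels (illustration only).
y_true = np.array([
    [1, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 0, 0, 0, 0],
])
y_pred = np.array([
    [1, 0, 1, 0, 1, 0],   # all six labels correct
    [0, 0, 0, 0, 0, 0],   # all six labels correct
    [1, 0, 1, 0, 1, 0],   # misses severe_toxic
    [1, 0, 0, 0, 1, 0],   # wrongly adds insult
])

# For multi-label input, accuracy_score counts a row as correct only if
# every label in that row matches (exact-match accuracy).
accuracy = accuracy_score(y_true, y_pred)

# Micro averaging pools true/false positives over all labels before dividing.
precision = precision_score(y_true, y_pred, average="micro")
recall = recall_score(y_true, y_pred, average="micro")

print(f"Accuracy: {accuracy:.4f}")    # Accuracy: 0.5000
print(f"Precision: {precision:.4f}")  # Precision: 0.8750
print(f"Recall: {recall:.4f}")        # Recall: 0.8750
```

Exact-match accuracy is strict: one wrong label out of six makes the whole comment count as an error. This is one reason accuracy can sit well below precision and recall in a multi-label setting.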
Now, load the model
Now comes the interesting part!
Let's predict the labels for user input
Output:
{'toxic': 1,
'severe_toxic': 0,
'obscene': 0,
'threat': 0,
'insult': 0,
'identity_hate': 0}
We can observe that the comment 'Are you insane!' is classified as toxic.
Let's check some more inputs
Output:
{'toxic': 0,
'severe_toxic': 0,
'obscene': 0,
'threat': 0,
'insult': 0,
'identity_hate': 0}
As expected, the comment 'How are you?' is not toxic, so all label values are 0.
Output:
{'toxic': 1,
'severe_toxic': 0,
'obscene': 1,
'threat': 0,
'insult': 1,
'identity_hate': 0}
As we can see, the comment "Such an Idiot person" is flagged for the labels toxic, obscene and insult, which seems right. It is not a threat or identity hate, so those values come out as 0.
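Prediction dictionaries like the ones above can be produced by passing each of the model's six output scores (logits) through a sigmoid and thresholding the result. The sketch below assumes a threshold of 0.5 and uses hand-picked logits rather than real model output; `logits_to_labels` is a hypothetical helper.

```python
import math

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

def sigmoid(x):
    """Map a real-valued logit to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def logits_to_labels(logits, threshold=0.5):
    """Return {label: 0 or 1}, marking labels whose probability reaches the threshold."""
    if len(logits) != len(LABELS):
        raise ValueError("Expected one logit per label")
    return {name: int(sigmoid(z) >= threshold) for name, z in zip(LABELS, logits)}

# Hand-picked logits (not real model output): positive means "label applies".
example_logits = [2.1, -3.0, 0.4, -4.2, 1.3, -2.5]
print(logits_to_labels(example_logits))
```

Each label is thresholded independently, which is what lets a single comment be toxic, obscene and an insult at the same time. A softmax over the six scores would instead force the labels to compete with each other.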