We are working on a deep learning model that predicts masks for brain tumors or skin lesions. What does making a mask mean? We classify each pixel of an image as 1 or 0. If the pixel belongs to the object of interest (for example, the tumor) we assign 1; if it does not, we assign 0. Classifying every pixel of an image in this way is called "Semantic Segmentation". With only two classes, it is binary segmentation.
If each pixel can belong to one of many classes, we are performing multiclass semantic segmentation. For example, in self-driving cars, pixels are classified as car, road, tree, house, sky, pedestrian, etc. "Instance Segmentation" goes one step further and also separates individual objects of the same class, such as two different pedestrians.
In both binary and multiclass segmentation, we need a loss function for calculating gradients.
Which accuracy metric and loss function should we use for image segmentation?
Let’s see some of our options:
1. Pixel accuracy:
We can compare each predicted pixel, one by one, with the ground truth mask. But this is very problematic when there is a class imbalance. Let me explain with an example:
When we create a mask for a brain tumor as in Image 1, it should look like Image 2.
Now let’s have a look at the mask below. If we predict this empty mask for the brain tumor in Image 1, the accuracy still comes out at approximately 98%. Why?
Because pixel accuracy only checks whether each pixel is classified correctly, and this mask assigns 0 to every pixel. In the MRI image, the tumor occupies only 2% of the image and the remaining 98% is background, hence the model is 98% accurate. The accuracy is really high, but we do not even have a mask! This is called the "class imbalance" problem. (Each pixel of the image has two possible classes: 1 mask, 0 no mask.)
A better idea is to compare only the two masks. Let’s see:
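To make the 98% figure concrete, here is a minimal NumPy sketch. The 100x100 image size, the position of the tumor region, and the function name pixel_accuracy are all assumptions chosen for illustration; only the 2% tumor share comes from the example above.

```python
import numpy as np

def pixel_accuracy(pred_mask, true_mask):
    """Fraction of pixels where the predicted label equals the ground truth."""
    pred_mask = np.asarray(pred_mask)
    true_mask = np.asarray(true_mask)
    return float((pred_mask == true_mask).mean())

# Hypothetical ground truth: a 10x20 tumor region, i.e. 200 of 10,000 pixels (2%).
true_mask = np.zeros((100, 100), dtype=np.uint8)
true_mask[40:50, 30:50] = 1

# A "model" that predicts no tumor anywhere.
empty_pred = np.zeros_like(true_mask)

print(pixel_accuracy(empty_pred, true_mask))  # 0.98
```

The empty prediction misses every tumor pixel and still scores 0.98, which is why pixel accuracy alone is a poor guide here.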
2. Jaccard’s Index (Intersection over Union, IoU)
In this accuracy metric, we compare the ground truth mask (the mask manually drawn by a radiologist) with the mask we create.
Let’s break it down:
Green region: We estimate 1 and the ground truth is 1. (True Positive, TP)
Blue region: We estimate 1 but the ground truth is 0. (False Positive, FP)
Yellow region: We estimate 0 but the ground truth is 1. (False Negative, FN)
Gray region: We estimate 0 and the ground truth is 0. (True Negative, TN)
Jaccard’s Index is the intersection of the two masks divided by their union: IoU = TP / (TP + FP + FN). It is between 0 and 1.
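The four regions above can be counted directly from two binary masks. The following is a small sketch, not code from the original project: the function names confusion_counts and iou_score, the toy masks, and the convention for two empty masks are assumptions.

```python
import numpy as np

def confusion_counts(pred_mask, true_mask):
    """Return (TP, FP, FN, TN) pixel counts for two binary masks."""
    pred = np.asarray(pred_mask).astype(bool)
    true = np.asarray(true_mask).astype(bool)
    tp = int(np.sum(pred & true))    # green: predicted 1, truth 1
    fp = int(np.sum(pred & ~true))   # blue: predicted 1, truth 0
    fn = int(np.sum(~pred & true))   # yellow: predicted 0, truth 1
    tn = int(np.sum(~pred & ~true))  # gray: predicted 0, truth 0
    return tp, fp, fn, tn

def iou_score(pred_mask, true_mask):
    """Jaccard's Index: TP / (TP + FP + FN). TN is ignored."""
    tp, fp, fn, _ = confusion_counts(pred_mask, true_mask)
    union = tp + fp + fn
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else tp / union

# Hypothetical masks: 200 true pixels, 200 predicted pixels, 100 overlapping.
true_mask = np.zeros((100, 100), dtype=np.uint8)
true_mask[40:50, 30:50] = 1
pred_mask = np.zeros_like(true_mask)
pred_mask[40:50, 40:60] = 1

print(confusion_counts(pred_mask, true_mask))           # (100, 100, 100, 9700)
print(iou_score(pred_mask, true_mask))                  # 100 / 300, about 0.333
print(iou_score(np.zeros_like(true_mask), true_mask))   # 0.0 for the empty mask
```

Because the true negatives never enter the formula, the empty mask that scored 98% on pixel accuracy scores 0 here.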
3. Dice Coefficient
Dice coefficient is very similar to Jaccard’s Index, but it counts the intersection (TP) twice: Dice = 2TP / (2TP + FP + FN).
Dice coefficient is a measure of overlap between two masks. 1 indicates a perfect overlap while 0 indicates no overlap.
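As a quick illustration, here is a NumPy sketch of the Dice coefficient on hard 0/1 masks. The function name dice_score and the toy masks are assumptions for this example. It also checks the standard identity Dice = 2 * IoU / (1 + IoU), which shows how closely the two metrics are related.

```python
import numpy as np

def dice_score(pred_mask, true_mask):
    """Dice coefficient: 2 * |A and B| / (|A| + |B|) = 2TP / (2TP + FP + FN)."""
    pred = np.asarray(pred_mask).astype(bool)
    true = np.asarray(true_mask).astype(bool)
    intersection = np.sum(pred & true)       # TP
    total = np.sum(pred) + np.sum(true)      # (TP + FP) + (TP + FN)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else float(2 * intersection / total)

# Hypothetical masks: 200 true pixels, 200 predicted pixels, 100 overlapping.
true_mask = np.zeros((100, 100), dtype=np.uint8)
true_mask[40:50, 30:50] = 1
pred_mask = np.zeros_like(true_mask)
pred_mask[40:50, 40:60] = 1

dice = dice_score(pred_mask, true_mask)
iou = 100 / 300  # TP / (TP + FP + FN) for the same masks

print(dice)                   # 0.5
print(2 * iou / (1 + iou))    # also about 0.5
```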
Dice Loss = 1 – Dice Coefficient. Easy!
We calculate the gradient of Dice Loss in backpropagation.
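Why is Dice Loss used instead of Jaccard’s? Computed on hard 0/1 masks, neither metric is differentiable, but both become differentiable once the predicted probabilities are used in place of thresholded predictions. Dice is the more common choice in practice, and its simple form makes the gradient easy to compute.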
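For reference, this is one common way to write the differentiable ("soft") Dice used for training. The symbols are notation introduced here, not taken from the original post: p_i is the predicted probability for pixel i, g_i is the ground truth label (0 or 1), and epsilon is a small smoothing constant that avoids division by zero.

```latex
\[
\mathrm{Dice}(p, g) = \frac{2 \sum_i p_i \, g_i + \epsilon}{\sum_i p_i + \sum_i g_i + \epsilon},
\qquad
\mathcal{L}_{\mathrm{Dice}} = 1 - \mathrm{Dice}(p, g)
\]
```

When every p_i is exactly 0 or 1 and epsilon is 0, the sums reduce to TP, TP + FP and TP + FN, so the expression becomes 2TP / (2TP + FP + FN), the Dice coefficient defined above.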
Code Example:
Let me give you the code for Dice accuracy and Dice Loss that I used in my PyTorch Semantic Segmentation of Brain Tumors project. In this code, I combined Binary Cross-Entropy Loss and Dice Loss in one function.
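The snippet below is a minimal PyTorch sketch of such functions, not necessarily the exact code from that project. The names dice_loss and bce_dice_loss follow the ones mentioned in this article. The name dice_coef, the smooth parameter, the assumption that the model outputs raw logits, and the unweighted sum of the two losses are all choices made for this example.

```python
import torch
import torch.nn.functional as F

def dice_coef(probs, target, smooth=1.0):
    """Soft Dice coefficient, averaged over the batch.

    probs:  predicted probabilities in [0, 1], shape (N, ...)
    target: ground truth mask of 0s and 1s, same shape as probs
    smooth: assumed smoothing constant, avoids division by zero
    """
    probs = probs.reshape(probs.size(0), -1)
    target = target.reshape(target.size(0), -1).float()
    intersection = (probs * target).sum(dim=1)
    denominator = probs.sum(dim=1) + target.sum(dim=1)
    return ((2.0 * intersection + smooth) / (denominator + smooth)).mean()

def dice_loss(probs, target, smooth=1.0):
    """Dice Loss = 1 - Dice Coefficient."""
    return 1.0 - dice_coef(probs, target, smooth)

def bce_dice_loss(logits, target, smooth=1.0):
    """Binary Cross-Entropy plus Dice Loss (assumes raw, pre-sigmoid logits)."""
    target = target.float()
    bce = F.binary_cross_entropy_with_logits(logits, target)
    return bce + dice_loss(torch.sigmoid(logits), target, smooth)

# Usage with random tensors standing in for a model output and a mask.
torch.manual_seed(0)
logits = torch.randn(2, 1, 8, 8, requires_grad=True)
target = (torch.rand(2, 1, 8, 8) > 0.5).float()

loss = bce_dice_loss(logits, target)
loss.backward()  # gradients flow through both the BCE and the Dice terms
print(loss.item(), logits.grad.shape)
```

If your model already ends with a sigmoid, pass its output straight to dice_loss and use F.binary_cross_entropy instead of the logits version.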
Conclusion:
We can use "dice_loss" or "bce_dice_loss" as a loss function in our image segmentation projects. In most situations, we obtain better segmentation results than with Binary Cross-Entropy Loss alone. Just plug-and-play!
Thanks for reading.
If you want to get in touch, you can email me at s[email protected], or you can find me at https://www.linkedin.com/in/seyma-tas/