The Mann Whitney U test is a popular nonparametric (distribution-free) test for comparing outcomes between two independent groups. A nonparametric test is appropriate when the outcome is not normally distributed and the samples are small. The test checks whether the distribution of a dependent variable differs between the two groups, where that variable is ordinal (its values have an intrinsic order or rank) or continuous. It is easy to perform this test in R.
Let's say our data contains two kinds of bulbs, orange and red, with the day-to-day base price recorded for each. The base price is the dependent variable, and the bulb color (red or orange) is the grouping variable. We want to know whether price gives us a reason to prefer a red or an orange bulb. If both price distributions are the same, the null hypothesis (no significant difference between the two groups) holds, and we can buy either one because price won't matter. To understand the Mann Whitney U test, one needs to know what the p-value is. It is the probability of seeing a difference at least as large as the observed one if the null hypothesis were true. We reject the null hypothesis when the p-value falls below a chosen significance level, conventionally 0.05. The output of this example in R is shown below.
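The bulb example can be sketched in base R as follows. This is a minimal sketch and not necessarily the code that produced the output in this article. The names `DATASET`, `BULB_TYPE` and `BULB_PRICE` and the prices come from that output, while the use of `aggregate()` for the summary, the helper name `summary_stats`, and the `exact = FALSE` argument are assumptions made here.

```r
# Bulb prices copied from the DATASET printout in this article
DATASET <- data.frame(
  BULB_TYPE = factor(rep(c("red", "orange"), each = 8)),
  BULB_PRICE = c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8,
                 47.8, 60.0, 63.4, 76.0, 89.4, 67.3, 61.3, 62.4)
)
print(DATASET)

# summary of the data: group size, median and IQR per bulb type
# (summary_stats is a hypothetical name; base R stands in for a dplyr summary)
summary_stats <- aggregate(
  BULB_PRICE ~ BULB_TYPE, data = DATASET,
  FUN = function(x) c(count = length(x), median = median(x), IQR = IQR(x))
)
print(summary_stats)

# boxplot of price by bulb type
boxplot(BULB_PRICE ~ BULB_TYPE, data = DATASET,
        xlab = "BULB_TYPE", ylab = "BULB_PRICE")

# Mann Whitney U test (R calls it the Wilcoxon rank sum test).
# exact = FALSE requests the normal approximation, which R would fall back
# to anyway because the price 63.4 appears in both groups (a tie).
res <- wilcox.test(BULB_PRICE ~ BULB_TYPE, data = DATASET, exact = FALSE)
print(res)
```

Because `BULB_TYPE` is a factor with levels in alphabetical order, the reported `W` refers to the orange group.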
Output:
    > DATASET
       BULB_TYPE BULB_PRICE
    1        red       38.9
    2        red       61.2
    3        red       73.3
    4        red       21.8
    5        red       63.4
    6        red       64.6
    7        red       48.4
    8        red       48.8
    9     orange       47.8
    10    orange       60.0
    11    orange       63.4
    12    orange       76.0
    13    orange       89.4
    14    orange       67.3
    15    orange       61.3
    16    orange       62.4
    # summary of the data
    `summarise()` ungrouping output (override with `.groups` argument)
    # A tibble: 2 x 4
      BULB_TYPE count median   IQR
      <fct>     <int>  <dbl> <dbl>
    1 orange        8   62.9   8.5
    2 red           8   55    17.7
    # boxplot

(Figure: output boxplot)

    > res

        Wilcoxon rank sum test with continuity correction

    data:  BULB_PRICE by BULB_TYPE
    W = 44.5, p-value = 0.2072
    alternative hypothesis: true location shift is not equal to 0
Explanation:
Here the p-value comes out to 0.2072, which is greater than the significance level of 0.05. We therefore fail to reject the null hypothesis: the data do not show a significant difference between the price distributions of red and orange bulbs. On the basis of price alone, we cannot say that buying one type of bulb is more profitable than the other.
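The reported `W` statistic can be checked by hand from the ranks, and the p-value can be compared against a significance level in code. The following is a minimal sketch: the vectors `red` and `orange` hold the prices from the output above, and `alpha = 0.05` is the conventional level, assumed here rather than stated in the output.

```r
# Prices copied from the DATASET printout
red    <- c(38.9, 61.2, 73.3, 21.8, 63.4, 64.6, 48.4, 48.8)
orange <- c(47.8, 60.0, 63.4, 76.0, 89.4, 67.3, 61.3, 62.4)

# Rank all 16 prices together; tied values share the average rank
ranks <- rank(c(orange, red))
n_orange <- length(orange)

# W for the orange group: its rank sum minus the smallest possible rank sum
W <- sum(ranks[seq_len(n_orange)]) - n_orange * (n_orange + 1) / 2
print(W)  # 44.5

# Decision rule at an assumed significance level
alpha <- 0.05
p_value <- wilcox.test(orange, red, exact = FALSE)$p.value
if (p_value < alpha) {
  decision <- "reject the null hypothesis"
} else {
  decision <- "fail to reject the null hypothesis"
}
print(decision)
```

The orange ranks sum to 80.5, and subtracting 8 * 9 / 2 = 36 gives 44.5, which matches the `W` printed by `wilcox.test()`.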