ANOVA stands for Analysis of Variance, a family of tests used to check whether there is a statistically significant difference between the mean values of three or more groups. A two-way ANOVA extends this idea to two categorical factors. Its main objective is to find out how each of the two factors affects a response variable, and whether the two factors interact, that is, whether the effect of one factor on the response depends on the level of the other.
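As a sketch of what is being tested, the two-way ANOVA with interaction is commonly written as the linear model below. The symbols are notation assumed here for illustration, not taken from the text: y_ijk is the response of the k-th observation at level i of the first factor and level j of the second, mu is the overall mean, alpha_i and beta_j are the main effects, (alpha beta)_ij is the interaction effect, and epsilon_ijk is the random error.

```latex
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}

% Null hypotheses tested by the three rows of the ANOVA table:
H_0^{(A)}:  \alpha_i = 0 \ \text{for all } i
\qquad
H_0^{(B)}:  \beta_j = 0 \ \text{for all } j
\qquad
H_0^{(AB)}: (\alpha\beta)_{ij} = 0 \ \text{for all } i, j
```

Each null hypothesis gets its own F statistic and p-value, which is why the ANOVA table has one row per factor plus one row for the interaction.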
Let us consider an example in which scientists need to know whether plant growth is affected by fertilizer and watering frequency. They planted exactly 30 plants and allowed them to grow for six months under different fertilizers and watering frequencies. After six months, they recorded the height of each plant in centimeters. Below is the step-by-step implementation:
First we will import numpy, pandas and statsmodels.
Let us create a pandas DataFrame that consists of the following three variables: the fertilizer used, the watering frequency, and the height of each plant in centimeters.
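One way such a DataFrame could be built is sketched below. Everything specific in it is an assumption made for illustration: the column names, the three fertilizer labels, the two watering schedules and the height values are hypothetical, chosen only so that the 30 plants form a balanced design with 5 plants per fertilizer and watering combination.

```python
import numpy as np
import pandas as pd

# Hypothetical balanced design: 3 fertilizers x 2 watering schedules x 5 plants = 30 rows
fertilizer = np.repeat(["A", "B", "C"], 10)
watering = np.tile(np.repeat(["daily", "weekly"], 5), 3)

# Made-up plant heights in centimeters (illustrative values, not measured data)
height = [14, 16, 15, 15, 16, 13, 12, 11, 14, 15,
          16, 16, 17, 18, 14, 13, 14, 14, 14, 15,
          16, 16, 17, 18, 14, 13, 14, 14, 14, 15]

df = pd.DataFrame({"Fertilizer": fertilizer,
                   "Watering": watering,
                   "height": height})

# Check that every Fertilizer/Watering combination has the same number of plants
cell_counts = df.groupby(["Fertilizer", "Watering"]).size()
print(df.head())
print(cell_counts)
```

Checking the cell counts before fitting is worthwhile: if the two factors never vary independently of each other, their effects and the interaction cannot be separated.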
To perform the two-way ANOVA, the statsmodels library provides the anova_lm() function, which computes the ANOVA table from a fitted linear model.
Output:
The p-values in the output show that neither factor, nor their interaction, has a statistically significant effect on plant height. The p-values for Fertilizer (0.913305), Watering (0.990865) and the interaction between Fertilizer and Watering (0.904053) are all greater than 0.05, so we fail to reject the null hypothesis for each term. The Residual row represents the variance that the model leaves unexplained. Its 28 degrees of freedom are the number of observations minus the number of parameters estimated by the model, not a count of observations.
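The decision rule described above can be sketched in a few lines. The p-values are the ones quoted in the text; the 0.05 significance level is the threshold the text uses, and the variable names are chosen here for illustration.

```python
# p-values as quoted in the text for each term of the model
p_values = {
    "Fertilizer": 0.913305,
    "Watering": 0.990865,
    "Fertilizer:Watering": 0.904053,
}

alpha = 0.05  # significance level used in the text

# Reject the null hypothesis only when the p-value falls below alpha
decisions = {
    term: "reject H0" if p < alpha else "fail to reject H0"
    for term, p in p_values.items()
}

for term, decision in decisions.items():
    print(f"{term}: p = {p_values[term]:.6f} -> {decision}")
```

Failing to reject the null hypothesis means the data give no evidence of an effect; it does not prove that the effect is absent.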