One of the most important steps after building a model is evaluating it. Many evaluation metrics are available, and the right one depends on the type of problem being solved, whether that is regression, classification, or something else. In this article, we will explain R-squared for regression analysis problems.
R-squared is a statistical measure that represents the goodness of fit of a regression model. For a least-squares model with an intercept, evaluated on the data it was fitted to, the value of R-squared lies between 0 and 1. R-squared equals 1 when the model fits the data perfectly and there is no difference between the predicted values and the actual values. R-squared equals 0 when the model explains none of the variability in the dependent variable, that is, when it has not learned any relationship between the dependent and independent variables.
R-squared, also known as the coefficient of determination, measures the proportion of the variability in the dependent variable Y that is explained by the independent variables Xi in the regression model.
R-squared is calculated as follows:
R-squared compares the residual sum of squares (SSres) with the total sum of squares (SStot). The total sum of squares is the sum of the squared vertical distances between the data points and the horizontal line at the mean of the dependent variable.
The residual sum of squares is the sum of the squared vertical distances between the data points and the best-fit line.
R-squared is calculated using the following formula:
$$R^2 = 1 - \frac{SS_{res}}{SS_{tot}}$$
Where SSres is the residual sum of squares and SStot is the total sum of squares.
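The calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual definition R² = 1 − SSres/SStot; the helper name `r_squared` and the toy data are made up for this example and do not come from any particular library.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Return R^2 = 1 - SS_res / SS_tot for 1-D arrays of equal length."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot

# Hypothetical toy data, for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Fit a straight line by ordinary least squares.
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept

r2 = r_squared(y, y_hat)
print(r2)  # approximately 0.6 (SS_res = 2.4, SS_tot = 6.0)
```

Here the fitted line is y = 0.6x + 2.2, so about 60% of the variability in y is explained by x.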
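The goodness of fit of a regression model can be assessed using R-squared. The closer the R-squared value is to 1, the better the model fits the data.
Note: The value of R-squared can also be negative. This happens when the model fits the data worse than a baseline that always predicts the mean of the dependent variable, for example when the model is evaluated on unseen data or is fitted without an intercept.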
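The three regimes (perfect fit, mean-only baseline, worse than the baseline) can be checked directly. The sketch below assumes the definition R² = 1 − SSres/SStot; the helper `r_squared` and the numbers are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Return R^2 = 1 - SS_res / SS_tot for 1-D arrays of equal length."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed values (mean = 6, SS_tot = 20).
y = np.array([3.0, 5.0, 7.0, 9.0])

perfect = y.copy()                     # predictions equal to the observations
mean_only = np.full_like(y, y.mean())  # always predict the mean
worse = y[::-1]                        # deliberately bad: reversed values

r2_perfect = r_squared(y, perfect)   # 1.0, since SS_res = 0
r2_mean = r_squared(y, mean_only)    # 0.0, since SS_res = SS_tot
r2_worse = r_squared(y, worse)       # -3.0, since SS_res = 80 > SS_tot = 20

print(r2_perfect, r2_mean, r2_worse)
```

A negative value simply means the residual sum of squares exceeds the total sum of squares.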
Adjusted R-squared is a modified version of R-squared that takes the number of independent variables into account. The main problem with R-squared is that, on the training data, its value never decreases when more independent variables are added, regardless of whether the new variables actually contribute to the model. R-squared can therefore reward models that overfit (have high variance) simply because they include many independent variables.
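Formula For Adjusted R-Squared
$$R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$$
Where n is the number of data points and k is the number of independent variables in the model.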
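The effect of the penalty can be sketched in Python. This example assumes the commonly used definition adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), where n is the number of data points and k the number of independent variables. The helper names (`r_squared`, `adjusted_r_squared`, `fit_ols_r2`) and the synthetic data are hypothetical, made up for illustration.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Return R^2 = 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_samples, n_features):
    """Return 1 - (1 - R^2)(n - 1) / (n - k - 1)."""
    dof = n_samples - n_features - 1
    if dof <= 0:
        raise ValueError("need n_samples > n_features + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof

def fit_ols_r2(X, y):
    """Fit least squares with an intercept and return the training R^2."""
    A = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return r_squared(y, A @ coef)

# Synthetic data: y depends on x1 only; the noise columns are unrelated to y.
rng = np.random.default_rng(0)
n = 30
x1 = rng.normal(size=n)
y = 2.0 * x1 + rng.normal(scale=0.5, size=n)
noise = rng.normal(size=(n, 3))

r2_base = fit_ols_r2(x1.reshape(-1, 1), y)            # k = 1
r2_more = fit_ols_r2(np.column_stack([x1, noise]), y)  # k = 4

adj_base = adjusted_r_squared(r2_base, n, 1)
adj_more = adjusted_r_squared(r2_more, n, 4)

print(f"R^2:          {r2_base:.4f} -> {r2_more:.4f}")
print(f"adjusted R^2: {adj_base:.4f} -> {adj_more:.4f}")
```

Training R-squared cannot go down when the extra columns are added, whereas adjusted R-squared is free to drop when the added variables explain too little to justify the lost degrees of freedom.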