Residual Sum of Squares is essentially the sum of the squared differences between the actual values of the dependent variable and the values predicted by the model. This metric provides a numerical representation of how well the model fits the data, with smaller values indicating a better fit and larger values suggesting a poorer fit.
For example, suppose we predict retail store sales from advertising spend using a linear regression model. To assess how well the model fits, we calculate the Residual Sum of Squares (RSS) by summing the squared differences between actual and predicted sales.
The scatter plot on the right displays the residuals, which are the differences between actual sales and predicted sales, plotted against advertising spend.
Ideally, we want these residuals to be randomly scattered around the horizontal zero line. If they are, it indicates that our model fits the data well.
However, in this case, we can see some patterns in the residuals, which suggests that our model may not be capturing all the underlying relationships in the data. This could mean we need a more complex model to better understand the relationship between advertising and sales.
In regression analysis, RSS is one of the three main types of sum of squares, alongside the Total Sum of Squares (TSS) and the Sum of Squares due to Regression (SSR), also called the Explained Sum of Squares (ESS).
Residual Sum of Squares (RSS) can be calculated using the following formula:
$$RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
Where,
- $y_i$ is the ith value of the variable to be predicted,
- $\hat{y}_i$ is the predicted value of $y_i$, and
- n is the number of observations.
The regression sum of squares measures how much of the variation in the dependent variable the model explains, that is, how far the predicted values lie from the mean of the observed values.
Consider a set X with n observations. The sum of squares S for this set can be calculated using the formula below:
$$S = \sum_{i=1}^{n} (X_i - \bar{X})^2$$
Where,
- $X_i$ is the ith observation of the set,
- $\bar{X}$ is the mean of the dataset, and
- n is the number of observations.
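The sum of squares of a set can be sketched in a few lines of Python. This is a minimal illustration; the function name `sum_of_squares` and the sample values are assumptions made for this example only.

```python
def sum_of_squares(values):
    """Sum of squared deviations of each observation from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


# Illustrative data: the mean is 4, so the deviations are -2, 0, 2.
print(sum_of_squares([2, 4, 6]))  # 4 + 0 + 4 = 8.0
```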
Total sum of squares is used to denote the amount of variation in the dependent variable. The total sum of squares is the sum of the regression sum of squares and the residual sum of squares. It is calculated as:
TSS = RSS + SSR
Where the abbreviations have their usual meaning.
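The identity can be checked numerically. The sketch below assumes a straight line with an intercept fitted by ordinary least squares, which is the setting in which TSS = RSS + SSR holds exactly. The helper `fit_line` and the spend and sales figures are hypothetical and serve only as an illustration.

```python
def fit_line(x, y):
    """Closed-form ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Hypothetical advertising spend and sales, for illustration only.
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.0, 4.1, 5.9, 8.2, 9.8]

intercept, slope = fit_line(spend, sales)
predicted = [intercept + slope * s for s in spend]
mean_sales = sum(sales) / len(sales)

tss = sum((y - mean_sales) ** 2 for y in sales)            # total variation
rss = sum((y - p) ** 2 for y, p in zip(sales, predicted))  # unexplained variation
ssr = sum((p - mean_sales) ** 2 for p in predicted)        # explained variation

print(f"TSS = {tss:.4f}, RSS + SSR = {rss + ssr:.4f}")
```

If the predictions come from a model that was not fitted by least squares, or from a line without an intercept, the two printed numbers will generally differ.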
The following steps show how to calculate both the residual sum of squares and the sum of squares due to regression.
To calculate the residual sum of squares, we can use the following steps:
Step 1: Organize the data to find the expected value.
Step 2: Calculate the residual, i.e., yi - ŷi.
Step 3: Use the following formula to calculate the Residual Sum of Squares:
$$RSS = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$
Step 4: The result is the required value of the Residual Sum of Squares.
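The four steps can be sketched in Python as follows. The data and the `predict` function are hypothetical: `predict` stands in for a line that is assumed to have been fitted already, and it is not taken from the example above.

```python
# Hypothetical advertising spend and actual sales, for illustration only.
spend = [1.0, 2.0, 3.0, 4.0]
actual = [5.5, 6.5, 8.5, 11.5]


def predict(s):
    """Assumed, already-fitted model: sales = 3 + 2 * spend."""
    return 3.0 + 2.0 * s


# Step 1: organize the data and find the expected (predicted) values.
predicted = [predict(s) for s in spend]  # [5.0, 7.0, 9.0, 11.0]

# Step 2: calculate each residual, actual minus predicted.
residuals = [y - y_hat for y, y_hat in zip(actual, predicted)]  # [0.5, -0.5, -0.5, 0.5]

# Step 3: square the residuals and sum them.
rss = sum(r ** 2 for r in residuals)

# Step 4: the result is the Residual Sum of Squares.
print(rss)  # 0.25 * 4 = 1.0
```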
To calculate the sum of squares due to regression we can use the following steps:
- Step 1: Calculate the mean of the given data
- Step 2: Calculate the difference between each data point and the mean. For the sum of squares due to regression, the data points are the predicted values ŷi.
- Step 3: Calculate the square of the value obtained in step 2.
- Step 4: Sum all the values obtained from Step 3.
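These steps can be sketched as below. The sketch assumes the usual definition of the regression sum of squares, in which the data points in Step 2 are the model's predicted values and the mean is that of the observed values. All numbers are hypothetical.

```python
# Hypothetical observed values and model predictions, for illustration only.
actual = [5.5, 6.5, 8.5, 11.5]
predicted = [5.0, 7.0, 9.0, 11.0]

# Step 1: calculate the mean of the observed data.
mean_actual = sum(actual) / len(actual)  # 32 / 4 = 8.0

# Step 2: difference between each predicted value and the mean.
deviations = [p - mean_actual for p in predicted]  # [-3.0, -1.0, 1.0, 3.0]

# Step 3: square each difference.
squared = [d ** 2 for d in deviations]  # [9.0, 1.0, 1.0, 9.0]

# Step 4: sum the squares.
ssr = sum(squared)
print(ssr)  # 20.0
```

Replacing `predicted` with `actual` in Step 2 would give the total sum of squares instead.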
The sum of squares formula can be used for various purposes and has great significance in real life such as:
The sum of squares has the following limitations:
Problem 1: Calculate the sum of squares of the set X = [1,2,3,6] if the mean is found to be 3.
Solution:
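Given:

| $X_i$ | $X_i - \bar{X}$ |
|---|---|
| 1 | -2 |
| 2 | -1 |
| 3 | 0 |
| 6 | 3 |

Using $S = \sum_{i=1}^{n} (X_i - \bar{X})^2$,
S = 4+1+0+9
S = 14
Therefore, the sum of squares of the set is 14.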
Problem 2: Calculate the sum of squares of the set X = [3,6,9,12,15] if the mean is found to be 9.
Solution:
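Given:

| $X_i$ | $X_i - \bar{X}$ |
|---|---|
| 3 | -6 |
| 6 | -3 |
| 9 | 0 |
| 12 | 3 |
| 15 | 6 |

Using $S = \sum_{i=1}^{n} (X_i - \bar{X})^2$,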
S = 36+9+0+9+36
S = 90
The sum of squares of the set is 90.
Problem 3: Calculate the sum of squares of the dataset X = [1,2,3,4,5,6]
Solution:
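In this case we need to calculate the mean first.
$\bar{X} = (1+2+3+4+5+6)/6 = 21/6 = 3.5$

| $X_i$ | $X_i - \bar{X}$ |
|---|---|
| 1 | -2.5 |
| 2 | -1.5 |
| 3 | -0.5 |
| 4 | 0.5 |
| 5 | 1.5 |
| 6 | 2.5 |

Using $S = \sum_{i=1}^{n} (X_i - \bar{X})^2$,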
S = 6.25+2.25+0.25+0.25+2.25+6.25
S = 17.50
The sum of squares of the set is 17.50.
Problem 4: Calculate the sum of squares of the dataset Y = [3,4,5,1,7]
Solution:
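In this case we need to calculate the mean first.
$\bar{Y} = (3+4+5+1+7)/5 = 20/5 = 4$

| $Y_i$ | $Y_i - \bar{Y}$ |
|---|---|
| 3 | -1 |
| 4 | 0 |
| 5 | 1 |
| 1 | -3 |
| 7 | 3 |

Using $S = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$,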
S = 1+0+1+9+9
S = 20
The sum of squares of the set is 20.
Problem 5: Calculate the sum of squares of the set X = [1,4,6,8] if the mean is found to be 4.75.
Solution:
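Given:

| $X_i$ | $X_i - \bar{X}$ |
|---|---|
| 1 | -3.75 |
| 4 | -0.75 |
| 6 | 1.25 |
| 8 | 3.25 |

Using $S = \sum_{i=1}^{n} (X_i - \bar{X})^2$,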
S = 14.0625+0.5625+1.5625+10.5625
S = 26.75
The sum of squares of the set is 26.75.
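The five answers above can be double-checked with a short script. This is only a verification sketch; the `sum_of_squares` helper and the `problems` dictionary are names introduced here for illustration.

```python
def sum_of_squares(values):
    """Sum of squared deviations of each observation from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


# Each entry maps a problem to its dataset and the answer worked out by hand.
problems = {
    "Problem 1": ([1, 2, 3, 6], 14),
    "Problem 2": ([3, 6, 9, 12, 15], 90),
    "Problem 3": ([1, 2, 3, 4, 5, 6], 17.5),
    "Problem 4": ([3, 4, 5, 1, 7], 20),
    "Problem 5": ([1, 4, 6, 8], 26.75),
}

for name, (data, expected) in problems.items():
    result = sum_of_squares(data)
    # Compare with a small tolerance to allow for floating-point rounding.
    assert abs(result - expected) < 1e-9, name
    print(f"{name}: S = {result}")
```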