
URL: https://www.geeksforgeeks.org/engineering-mathematics/identifying-outliers-in-statistics-worksheet/


Identifying Outliers in Statistics Worksheet

Last Updated : 23 Jul, 2025

In statistics, an outlier is a data point that is significantly different from the rest of the data. It is either much higher or much lower than most of the other values in a dataset.

For example, if you're looking at the ages of people in a group and most are between 20 and 40, but one person is 95, that 95 is an outlier because it's far outside the usual range.

Common Methods to Detect Outliers

Outlier identification relies on statistical procedures that flag points unlikely to have been generated by the distribution under study. Detecting such points helps keep analyses accurate and representative. Two of the most common methods are:

  • Interquartile Range (IQR) Method
  • Z-Score (Standard Score) Method

Interquartile Range (IQR) Method

Interquartile Range (IQR) Method is a widely used technique for detecting outliers in a dataset. It works by identifying values that fall significantly above or below the central range of the data.

In this method, we sort the data, find the first and third quartiles (Q1 and Q3), and then compute the IQR:

IQR = Q3 − Q1

Using the IQR, we can find the lower and upper bounds for the data. Any value outside these bounds is treated as an outlier:

  • Lower Bound = Q1 − 1.5 × IQR
  • Upper Bound = Q3 + 1.5 × IQR
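The steps above can be sketched in Python. This is a minimal illustration, not a definitive implementation: it assumes the median-of-halves convention for quartiles (other conventions can give slightly different Q1 and Q3), the function names `quartiles`, `iqr_bounds` and `iqr_outliers` are hypothetical, and the `ages` list is made-up data in the spirit of the age example above.

```python
from statistics import median


def quartiles(data):
    """Return (Q1, Q3) using the median-of-halves convention.

    Assumption: when the number of values is odd, the overall median
    is excluded from both halves. Needs at least two values.
    """
    values = sorted(data)
    half = len(values) // 2
    lower_half = values[:half]
    upper_half = values[len(values) - half:]
    return median(lower_half), median(upper_half)


def iqr_bounds(data, k=1.5):
    """Return (lower bound, upper bound) = (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def iqr_outliers(data, k=1.5):
    """Return the values that fall outside the IQR bounds."""
    lower, upper = iqr_bounds(data, k)
    return [x for x in data if x < lower or x > upper]


# Hypothetical ages: most between 20 and 40, plus one person aged 95.
ages = [22, 25, 28, 31, 34, 37, 40, 95]
print(iqr_bounds(ages))    # (8.5, 56.5)
print(iqr_outliers(ages))  # [95]
```

The multiplier is exposed as the parameter `k` so that a stricter or looser rule can be tried without changing the rest of the code.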

Z-Score Method

Z-Score Method is a statistical technique used to identify outliers by measuring how many standard deviations a data point is from the mean of the dataset. The Z-score helps detect data points that significantly deviate from the average.

Data points with a Z-score greater than 3 or less than −3 are generally regarded as outliers. The Z-score is calculated as:

Z = (X − μ) / σ

Where:

  • X is the data point
  • μ is the mean of the dataset
  • σ is the standard deviation
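As a rough sketch, the Z-score rule can be written in Python with the standard `statistics` module. It assumes σ is the population standard deviation (`pstdev`); the function names `z_scores` and `z_score_outliers`, and both datasets, are illustrative only.

```python
from statistics import mean, pstdev


def z_scores(data):
    """Return the Z-score (x - mean) / sigma for every value."""
    mu = mean(data)
    sigma = pstdev(data, mu)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("all values are identical; Z-scores are undefined")
    return [(x - mu) / sigma for x in data]


def z_score_outliers(data, threshold=3.0):
    """Return values whose absolute Z-score exceeds the threshold."""
    return [x for x, z in zip(data, z_scores(data)) if abs(z) > threshold]


# Illustrative values with mean 5 and standard deviation 2.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_scores(sample))          # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
print(z_score_outliers(sample))  # []

# Hypothetical readings: nineteen values of 50 and one of 150.
readings = [50] * 19 + [150]
print(z_score_outliers(readings))  # [150]
```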

Other than these, there are several more methods, including:

  • Box Plot
  • Visual Inspection (Scatter Plot, Line Plot)
  • Grubbs' Test
  • DBSCAN (Density-Based Spatial Clustering of Applications with Noise)

Examples on Identifying Outliers

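Example 1: Consider the dataset: [2, 4, 5, 7, 8, 12, 15, 18, 22, 25, 28]. Identify any outliers using the IQR method.

Solution:

The data is already sorted and has 11 values, so the median is the 6th value, 12. Taking the median of each half (excluding the overall median) gives:

  • Q1 = 5 (median of 2, 4, 5, 7, 8)
  • Q3 = 22 (median of 15, 18, 22, 25, 28)

Thus, IQR = 22 − 5 = 17

Determine the bounds for outliers:

  • Lower Bound = 5 − 1.5 × 17 = −20.5
  • Upper Bound = 22 + 1.5 × 17 = 47.5

Since no data points are below −20.5 or above 47.5, there are no outliers in this dataset.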

Example 2: Consider the dataset: 12, 15, 17, 22, 29, 150, 16, 13, 18, 19

Identify any outliers using the IQR method.

Solution:

Sorted Data in ascending order: 12, 13, 15, 16, 17, 18, 19, 22, 29, 150

Here, Q1 = 15, Q3 = 22
Thus, IQR = Q3 - Q1 = 22 - 15 = 7

  • Lower Bound = Q1 - 1.5 × IQR = 15 - 1.5 × 7 = 4.5
  • Upper Bound = Q3 + 1.5 × IQR = 22 + 1.5 × 7 = 32.5

Result: Any value below 4.5 or above 32.5 is an outlier. In this case, 150 is an outlier
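The arithmetic in Example 2 can be checked with a short Python script. This is only a sketch: it assumes the median-of-halves convention for Q1 and Q3, which reproduces the quartiles used in the solution, and the variable names are arbitrary.

```python
from statistics import median

data = [12, 15, 17, 22, 29, 150, 16, 13, 18, 19]

# Sort, then split into lower and upper halves (10 values, so 5 each).
values = sorted(data)
half = len(values) // 2
q1 = median(values[:half])
q3 = median(values[len(values) - half:])

iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = [x for x in values if x < lower or x > upper]

print(q1, q3, iqr)   # 15 22 7
print(lower, upper)  # 4.5 32.5
print(outliers)      # [150]
```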

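Example 3: Given the dataset [10, 12, 13, 15, 18, 20], calculate the mean (μ) and standard deviation (σ), then identify any outliers using the Z-score method.

Solution:

Using the formulas for the mean and the population standard deviation, we get:

  • μ = (10 + 12 + 13 + 15 + 18 + 20) / 6 = 88 / 6 ≈ 14.67
  • σ = √(Σ(X − μ)² / 6) = √(71.33 / 6) ≈ 3.45

Now, calculate the Z-score for each point using Z = (X − μ) / σ:

  • Z(10) ≈ −1.35
  • Z(12) ≈ −0.77
  • Z(13) ≈ −0.48
  • Z(15) ≈ 0.10
  • Z(18) ≈ 0.97
  • Z(20) ≈ 1.55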

Since no Z-score lies outside the range of −3 to 3, there are no outliers in this dataset.
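The calculation for Example 3 can be reproduced with Python's `statistics` module. The sketch below assumes σ is the population standard deviation (`pstdev`); using the sample standard deviation would give slightly smaller Z-scores.

```python
from statistics import mean, pstdev

data = [10, 12, 13, 15, 18, 20]

mu = mean(data)
sigma = pstdev(data, mu)  # assumption: population standard deviation

# Z-score of each value, rounded to two decimals for readability.
z = [round((x - mu) / sigma, 2) for x in data]

print(round(mu, 3), round(sigma, 3))  # 14.667 3.448
print(z)                              # [-1.35, -0.77, -0.48, 0.1, 0.97, 1.55]
print(any(abs(v) > 3 for v in z))     # False
```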

Example 4: Consider the dataset: Data=[56, 57, 58, 60, 61, 63, 65, 67, 90]

Identify any outliers using the Z-score method.

Solution:

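Given: 56, 57, 58, 60, 61, 63, 65, 67, 90

Mean: μ = 577 / 9 ≈ 64.11

Population standard deviation: σ = √(860.89 / 9) ≈ 9.78

Calculate the Z-score for each data point using the formula Z = (X − μ) / σ:

  • Z(56) ≈ −0.83
  • Z(57) ≈ −0.73
  • Z(58) ≈ −0.62
  • Z(60) ≈ −0.42
  • Z(61) ≈ −0.32
  • Z(63) ≈ −0.11
  • Z(65) ≈ 0.09
  • Z(67) ≈ 0.30
  • Z(90) ≈ 2.65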

Since all Z-scores fall within the range of −3 to 3, there are no outliers in this dataset.
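The example does not say whether σ is the population or the sample standard deviation, so the sketch below checks the most extreme value under both assumptions. The helper name `most_extreme_z` is hypothetical.

```python
from statistics import mean, pstdev, stdev

data = [56, 57, 58, 60, 61, 63, 65, 67, 90]


def most_extreme_z(values, sd_func):
    """Return (value, z) for the point farthest from the mean."""
    mu = mean(values)
    sd = sd_func(values)
    value = max(values, key=lambda x: abs(x - mu))
    return value, (value - mu) / sd


for label, sd_func in (("population", pstdev), ("sample", stdev)):
    value, z = most_extreme_z(data, sd_func)
    print(f"{label} SD: most extreme value = {value}, z = {z:.2f}")
# population SD: most extreme value = 90, z = 2.65
# sample SD: most extreme value = 90, z = 2.50
```

Under either assumption the value 90 stays within the −3 to 3 range, so the conclusion of the example is unchanged.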

Worksheet on Identifying Outliers


You can download this free worksheet on identifying outliers in a dataset below:

Download Free Worksheet on Identifying Outliers
