Practice Questions on Data Handling

Last Updated : 23 Jul, 2025

Data handling refers to the process of managing and manipulating data. It has many real-world applications in data analysis and statistics.

This article provides the formulas needed to solve questions on data handling, followed by a set of solved and unsolved practice questions that will help you build a solid grasp of the underlying concepts.

Important Formulas for Data Handling

The following formulas are helpful in solving questions on Data Handling:

Measures of Central Tendency

Mean (μ) = (Σx)/n
Median: Middle value of the sorted dataset, i.e. the ((n + 1)/2)th value if n is odd, or the average of the (n/2)th and (n/2 + 1)th values if n is even
Mode: Most frequently occurring value in a dataset
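
The three measures above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names and the `scores` list are assumptions made for the example. The median helper covers both the odd and the even case.

```python
from collections import Counter

def mean(values):
    """Arithmetic mean: sum of the values divided by their count."""
    return sum(values) / len(values)

def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def modes(values):
    """All values that occur most often (a dataset can have more than one mode)."""
    counts = Counter(values)
    top = max(counts.values())
    return [v for v, c in counts.items() if c == top]

scores = [4, 7, 7, 9, 10, 11]  # hypothetical data, for illustration only
print(mean(scores))    # 8.0
print(median(scores))  # 8.0
print(modes(scores))   # [7]
```

Returning a list from `modes` is a design choice for this sketch: it avoids silently picking one value when several are tied.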

Measures of Dispersion

Range = Maximum value - Minimum value
Variance (σ²) = Σ((x - μ)²)/n (population variance; for a sample, divide by n - 1 instead of n)
Standard Deviation (σ) = √Variance
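
As a rough sketch, the dispersion formulas translate directly into Python. The function names and the `data` list are assumptions made for this example, and the variance here is the population form (dividing by n), matching the formula above.

```python
import math

def data_range(values):
    """Range: maximum value minus minimum value."""
    return max(values) - min(values)

def population_variance(values):
    """Mean of the squared deviations from the mean (divides by n)."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)

def population_std(values):
    """Standard deviation: square root of the population variance."""
    return math.sqrt(population_variance(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, for illustration only
print(data_range(data))           # 7
print(population_variance(data))  # 4.0
print(population_std(data))       # 2.0
```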

Correlation

Pearson correlation coefficient (r) = Σ((x - x̄)(y - ȳ)) / √(Σ(x - x̄)² × Σ(y - ȳ)²)
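
One way to compute r from this formula is sketched below. The helper name `pearson_r` and the sample lists are hypothetical; the code assumes both lists have the same length and that neither variable is constant (otherwise the denominator is zero).

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient from paired observations."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

x = [1, 2, 3, 4, 5]  # hypothetical data, for illustration only
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 4))  # 0.7746
```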

Regression

Linear Regression: y = mx + c (where m is the slope and c is the intercept)
Slope (m) = Σ((x - x̄)(y - ȳ)) / Σ(x - x̄)²
Intercept (c) = ȳ - m×x̄
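
The slope and intercept formulas can be sketched as a small least-squares helper. The name `fit_line` and the sample data are assumptions made for this example; the code assumes the x values are not all identical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = m*x + c."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    m = s_xy / s_xx          # slope
    c = y_bar - m * x_bar    # intercept
    return m, c

x = [1, 2, 3, 4, 5]  # hypothetical data, for illustration only
y = [2, 4, 5, 4, 5]
m, c = fit_line(x, y)
print(round(m, 3), round(c, 3))  # 0.6 2.2
```

The fitted line always passes through the point (x̄, ȳ), which is why the intercept follows directly from the slope.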

Hypothesis Testing

Z-test: Z = (X̄ - μ) / (σ / √n), where X̄ is the sample mean, μ is the population mean, σ is the population standard deviation, and n is the sample size
t-test: t = (X̄ - μ) / (s / √n), where s is the sample standard deviation
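
Both statistics use the same arithmetic and differ only in which standard deviation goes in and which reference distribution the result is compared against. The sketch below uses hypothetical function names and made-up summary values; the cutoff 1.96 is the usual two-tailed critical value of the standard normal distribution at α = 0.05.

```python
import math

def z_statistic(sample_mean, pop_mean, pop_std, n):
    """Z = (sample mean - population mean) / (population std / sqrt(n))."""
    return (sample_mean - pop_mean) / (pop_std / math.sqrt(n))

def t_statistic(sample_mean, pop_mean, sample_std, n):
    """t = (sample mean - population mean) / (sample std / sqrt(n)); df = n - 1."""
    return (sample_mean - pop_mean) / (sample_std / math.sqrt(n))

# Hypothetical summary values, for illustration only
z = z_statistic(sample_mean=52, pop_mean=50, pop_std=6, n=36)
print(z)              # 2.0
print(abs(z) > 1.96)  # True: reject H0 in a two-tailed test at alpha = 0.05
```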

Practice Questions on Data Handling - Solved

1. The following dataset represents the scores obtained by students in a mathematics exam: [75, 80, 85, 90, 85, 70, 80, 85, 90, 95]. Calculate the mean, median, and mode of the dataset.

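Mean = (75 + 80 + 85 + 90 + 85 + 70 + 80 + 85 + 90 + 95) / 10 = 835 / 10 = 83.5
Sorted data: 70, 75, 80, 80, 85, 85, 85, 90, 90, 95
Median = average of the 5th and 6th values = (85 + 85) / 2 = 85
Mode = 85 (it occurs three times, more often than any other value)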

2. Compute the range, variance, and standard deviation for the following dataset: [10, 15, 20, 25, 30]

Range = Maximum value - Minimum value = 30 - 10 = 20
Mean = (10 + 15 + 20 + 25 + 30) / 5 = 100 / 5 = 20
Variance = [(10 - 20)² + (15 - 20)² + (20 - 20)² + (25 - 20)² + (30 - 20)²] / 5
= (100 + 25 + 0 + 25 + 100) / 5 = 250 / 5 = 50
Standard Deviation = √Variance = √50 ≈ 7.07
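
As a quick check, Python's standard `statistics` module reproduces these values. Note that `pvariance` and `pstdev` divide by n, as in this solution, while `variance` divides by n - 1 and therefore gives a larger result for the same data.

```python
import statistics

data = [10, 15, 20, 25, 30]

print(max(data) - min(data))                 # 20
print(f"{statistics.pvariance(data):.2f}")   # 50.00 (population variance, divides by n)
print(f"{statistics.pstdev(data):.2f}")      # 7.07
print(f"{statistics.variance(data):.2f}")    # 62.50 (sample variance, divides by n - 1)
```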

3. Calculate the Pearson correlation coefficient (r) for the following dataset:

X: [10, 15, 20, 25, 30]

Y: [20, 25, 30, 35, 40]

Mean of X = (10 + 15 + 20 + 25 + 30) / 5 = 100 / 5 = 20
Mean of Y = (20 + 25 + 30 + 35 + 40) / 5 = 150 / 5 = 30
Σ((x - x̄)(y - ȳ)) = (10 - 20)(20 - 30) + (15 - 20)(25 - 30) + (20 - 20)(30 - 30) + (25 - 20)(35 - 30) + (30 - 20)(40 - 30)
= (-10 × -10) + (-5 × -5) + (0 × 0) + (5 × 5) + (10 × 10)
= 100 + 25 + 0 + 25 + 100 = 250
Σ(x - x̄)² = (10 - 20)² + (15 - 20)² + (20 - 20)² + (25 - 20)² + (30 - 20)²
= 100 + 25 + 0 + 25 + 100 = 250
Σ(y - ȳ)² = (20 - 30)² + (25 - 30)² + (30 - 30)² + (35 - 30)² + (40 - 30)²
= 100 + 25 + 0 + 25 + 100 = 250
r = Σ((x - x̄)(y - ȳ)) / √(Σ(x - x̄)² × Σ(y - ȳ)²)
= 250 / √(250 × 250) = 250 / 250 = 1

4. Perform a t-test for the given dataset to test the hypothesis that the mean is 20:

Dataset: [18, 19, 21, 22, 20, 23, 17, 20, 19, 20]

(Assuming a significance level of 0.05)

Mean = (18 + 19 + 21 + 22 + 20 + 23 + 17 + 20 + 19 + 20) / 10 = 199 / 10 = 19.9
Standard Deviation (s) = √[(Σ(x - x̄)²) / (n - 1)] = √[(3.61 + 0.81 + 1.21 + 4.41 + 0.01 + 9.61 + 8.41 + 0.01 + 0.81 + 0.01) / 9]
= √(28.9 / 9) = √3.211 ≈ 1.792
t = (X̄ - μ) / (s / √n) = (19.9 - 20) / (1.792 / √10) ≈ -0.176
Degrees of Freedom (df) = n - 1 = 10 - 1 = 9
Critical t-value for df = 9 at α = 0.05 (two-tailed) is approximately ±2.262
Since |-0.176| < 2.262, we fail to reject the null hypothesis.
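
The same test can be recomputed from the raw data with a short script. This is a sketch rather than a full statistical routine: the variable names are assumptions, and the critical value is simply the table value quoted above, not something the code derives.

```python
import math
import statistics

data = [18, 19, 21, 22, 20, 23, 17, 20, 19, 20]
mu0 = 20            # hypothesized mean from the question
t_critical = 2.262  # two-tailed t-table value for df = 9, alpha = 0.05

n = len(data)
x_bar = statistics.mean(data)
s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)
t = (x_bar - mu0) / (s / math.sqrt(n))

print(f"mean = {x_bar}, s = {s:.3f}, t = {t:.3f}, df = {n - 1}")
print("reject H0" if abs(t) > t_critical else "fail to reject H0")
```

Computing s with `statistics.stdev` instead of by hand avoids rounding slips in the individual squared deviations.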

5. The heights (in inches) of a sample of 5 students are as follows: 65, 68, 70, 63, 72. Calculate the mean height of the students.

Mean = (65 + 68 + 70 + 63 + 72) / 5
Mean = 338 / 5
Mean = 67.6 inches

6. Calculate the variance of the following dataset: 5, 8, 10, 12, 15.

Mean = (5 + 8 + 10 + 12 + 15) / 5
Mean = 50 / 5
Mean = 10.
Now, calculate the squared deviations from the mean:
(5 - 10)² = 25
(8 - 10)² = 4
(10 - 10)² = 0
(12 - 10)² = 4
(15 - 10)² = 25
Variance = (25 + 4 + 0 + 4 + 25) / 5
Variance = 58 / 5
Variance = 11.6.

7. What is the correlation coefficient if the covariance between two variables X and Y is 50, the standard deviation of X is 5, and the standard deviation of Y is 10?

Correlation coefficient (r) = Covariance / (Standard deviation of X × Standard deviation of Y)
r = 50 / (5 × 10)
r = 50 / 50
r = 1

8. Perform a t-test with the following data: sample mean = 65, population mean = 60, sample standard deviation = 8, sample size = 25. Assume a significance level of 0.05.

t = (X̄ - μ) / (s / √n)
t = (65 - 60) / (8 / √25)
t = 5 / (8 / 5)
t = 5 / 1.6
t = 3.125
With a significance level of 0.05 (two-tailed) and n - 1 = 24 degrees of freedom, the critical t-value is approximately ±2.064. Since |3.125| > 2.064, we reject the null hypothesis.

9. Calculate the median of the following dataset: 12, 15, 18, 20, 22, 25, 28, 30.

Since there are 8 data points, the median is the average of the 4th and 5th terms.
Median = (20 + 22) / 2
Median = 21.

10. Find the range of the following dataset: 10, 15, 20, 25, 30.

Range = Maximum value - Minimum value
Range = 30 - 10
Range = 20

Practice Questions on Data Handling - Unsolved

Q1. Calculate the mode of the following dataset: 12, 15, 18, 20, 22, 25, 28, 30.

Q2. Find the standard deviation of the following dataset: 5, 8, 10, 12, 15.

Q3. Given the following dataset: 18, 20, 22, 24, 26, 28, 30, 32. Perform a Z-test with a sample mean of 25, population mean of 22, sample standard deviation of 4, and a sample size of 20. Use a significance level of 0.05.

Q4. Create a scatter plot for the following dataset:

X: 10, 15, 20, 25, 30

Y: 5, 8, 12, 18, 22

Q5. Explain the difference between descriptive and inferential statistics. Give examples of each.

Q6. Discuss the ethical considerations in handling data, especially in the context of data privacy and bias.

Q7. What are the advantages and disadvantages of using surveys as a method of data collection?

Q8. Calculate the Pearson correlation coefficient for the following dataset:

X: 25, 30, 35, 40, 45

Y: 12, 15, 20, 25, 30

Q9. Explain the concept of data preprocessing and discuss its significance in data analysis.

Q10. What are some common data visualization tools and techniques used in data handling? Provide examples of each.

