Probability and Statistics

Last Updated : 8 Nov, 2025

Probability and Statistics are core topics in the study of numbers and data. Probability measures how likely an event is to happen, such as the chance that it will rain tomorrow. Statistics involves collecting, analyzing, and interpreting data to draw meaningful conclusions. Together, they help us make informed decisions and recognize patterns in the information around us.


Probability

Probability is a measure of the likelihood or chance of an event occurring.

It is expressed as a number between 0 and 1, where 0 indicates an impossible event, and 1 signifies a sure event.
When all outcomes are equally likely, the probability of an event is calculated by dividing the number of favorable outcomes by the total number of possible outcomes.

In simple terms, it quantifies the likelihood of an outcome in a given set of circumstances, providing a basis for making informed predictions and decisions in various fields, including mathematics, statistics, and everyday life.

Statistics

Statistics is the branch of mathematics that involves the collection, analysis, interpretation, presentation, and organization of data.

It provides methods for making inferences about populations based on samples.
It encompasses various techniques, including descriptive statistics to summarize data and inferential statistics to make predictions or test hypotheses about larger populations.

Statistics helps to quantify uncertainty and variation in data, enabling researchers, analysts, and decision-makers to draw meaningful conclusions and make informed decisions.

Terms Related to Probability and Statistics

Random Experiment: An experiment is a set of steps that gives clear results. A random experiment is one where the exact outcome cannot be predicted.
Outcome: An outcome is any single possible result of a random experiment. For example, when you flip a fair coin, the possible outcomes are heads and tails.
Sample Space: The sample space, denoted S, is the collection of all possible outcomes of an experiment. For a coin flip, S = {heads, tails}.
Event: An event is any subset of the sample space. Event A occurs when the observed outcome belongs to A. For instance, if event A is rolling an even number on a fair six-sided die, getting 2, 4, or 6 means event A occurred. If you get 1, 3, or 5, event A did not happen.
Trial: A trial is a single performance of a random experiment. In the coin-flipping experiment, each flip of the coin is a trial.
Mean: The mean of a random variable is the average of all possible values it can take, weighted by their probabilities.
Expected Value: The expected value is the mean of a random variable. For instance, if we roll a six-sided die, the expected value is the average of all possible outcomes, which is 3.5.
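
The expected value of a die roll mentioned above can be checked with a few lines of Python. This is a minimal sketch that assumes a fair die, so each face has probability 1/6; the variable names are illustrative only.

```python
from fractions import Fraction

# Faces of a fair six-sided die; each face is assumed to have probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
p = Fraction(1, len(faces))

# Expected value: each possible value weighted by its probability, then summed.
expected_value = sum(face * p for face in faces)

print(expected_value, float(expected_value))  # 7/2 3.5
```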

Probability and Statistics Formulas

Some of the common formulas of Probability and Statistics are discussed below:

Probability Formulas

Probability is the likelihood of an event occurring and is calculated using the following formula:

P(A) = Number of Favorable Outcomes / Total Number of Possible Outcomes

Where:

P(A) is the probability of event A.
Number of Favorable Outcomes is the count of outcomes where event A occurs.
Total Number of Possible Outcomes is the count of all possible outcomes.

In simple terms, probability is the ratio of successful outcomes to all possible outcomes. The result is a number between 0 (impossible event) and 1 (certain event). It can also be expressed as a percentage by multiplying the result by 100.

For example, if you want to find the probability of rolling a 4 on a six-sided die, there is 1 favorable outcome (rolling a 4) out of 6 possible outcomes (1, 2, 3, 4, 5, 6). Therefore,
P(rolling a 4) = 1/6 ≈ 0.167
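
The same calculation can be sketched in Python. The helper `classical_probability` is a hypothetical name introduced here for illustration, and it assumes every outcome in the sample space is equally likely.

```python
from fractions import Fraction

def classical_probability(event, sample_space):
    """Favorable outcomes divided by total outcomes (equally likely outcomes assumed)."""
    favorable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favorable), len(sample_space))

die = {1, 2, 3, 4, 5, 6}
p_four = classical_probability({4}, die)

print(p_four)                        # 1/6
print(round(float(p_four) * 100, 2)) # 16.67 (as a percentage)
```

Using `Fraction` keeps the result exact, which makes it easy to compare with the hand calculation.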

Addition Rule Formula

The addition rule of probability is used to find the probability that at least one of two events occurs.

If events A and B are not mutually exclusive (they can happen at the same time), then:
P(A or B) = P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
where P(A ∩ B) is the probability of both A and B occurring.
If events A and B are mutually exclusive (they cannot happen at the same time), then P(A ∩ B) = 0 and the rule becomes:
P(A or B) = P(A ∪ B) = P(A) + P(B)
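
Both forms of the addition rule can be checked on a fair six-sided die. The sketch below uses Python sets for events; the events chosen (even numbers, multiples of three, rolling a 1, rolling a 2) and the helper `prob` are assumptions made for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event on a fair die (equally likely outcomes assumed)."""
    return Fraction(len(event & sample_space), len(sample_space))

# Not mutually exclusive: 6 is both even and a multiple of three.
even = {2, 4, 6}
multiple_of_three = {3, 6}
p_union = prob(even) + prob(multiple_of_three) - prob(even & multiple_of_three)

# Mutually exclusive: a single roll cannot be both 1 and 2.
one, two = {1}, {2}
p_exclusive = prob(one) + prob(two)

print(p_union)      # 2/3
print(p_exclusive)  # 1/3
```

In both cases the result matches the probability of the union computed directly, for example `prob(even | multiple_of_three)`.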

Multiplication Rule Formula

The multiplication rule of probability is used to find the probability of two events occurring together.

If events A and B are independent (they do not affect each other), then:
P(A ∩ B) = P(A) × P(B)
If events A and B are dependent (the occurrence of A affects the occurrence of B), then:
P(A ∩ B) = P(A) × P(B∣A)

Here, P(B∣A) is the likelihood of event B happening when event A has already occurred.
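
A short sketch of both cases follows. The coin-and-die pair and the bag of 3 red and 2 blue marbles are hypothetical examples, not data from this article; the dependent case is cross-checked by listing every ordered pair of draws.

```python
from fractions import Fraction
from itertools import permutations

# Independent events: a fair coin shows heads AND a fair die shows six.
p_heads = Fraction(1, 2)
p_six = Fraction(1, 6)
p_heads_and_six = p_heads * p_six  # P(A ∩ B) = P(A) × P(B)

# Dependent events: two red marbles drawn without replacement
# from a hypothetical bag holding 3 red and 2 blue marbles.
p_first_red = Fraction(3, 5)
p_second_red_given_first_red = Fraction(2, 4)  # one red marble is already gone
p_both_red = p_first_red * p_second_red_given_first_red  # P(A) × P(B|A)

# Cross-check the dependent case by enumerating all ordered draws of two marbles.
bag = ["R1", "R2", "R3", "B1", "B2"]
draws = list(permutations(bag, 2))
both_red = [d for d in draws if d[0].startswith("R") and d[1].startswith("R")]
p_both_red_enumerated = Fraction(len(both_red), len(draws))

print(p_heads_and_six)  # 1/12
print(p_both_red)       # 3/10
```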

Bayes' Rule

Bayes' Rule is a formula used to update probabilities based on new evidence. It calculates the probability of an event A happening given the occurrence of another event B. The formula is as follows:

P(A∣B) = [P(B∣A) × P(A)] / P(B), where P(B) ≠ 0

Here:

P(A∣B) is the probability of event A occurring given that event B has occurred.
P(B∣A) is the probability of event B occurring given that event A has occurred.
P(A) and P(B) are the probabilities of events A and B occurring, respectively.
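
Bayes' Rule, P(A|B) = P(B|A) × P(A) / P(B), can be sketched with a screening-test scenario. All numbers below (1% prevalence, 95% detection rate, 5% false-positive rate) are hypothetical values chosen only to illustrate the arithmetic, and `bayes` is an illustrative helper name.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be non-zero")
    return p_b_given_a * p_a / p_b

# Hypothetical inputs: A = "has the condition", B = "test is positive".
p_a = Fraction(1, 100)                # P(A): prevalence
p_b_given_a = Fraction(95, 100)       # P(B|A): positive when condition is present
p_b_given_not_a = Fraction(5, 100)    # P(B|not A): false-positive rate

# Total probability of a positive test, split over A and not A.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

posterior = bayes(p_b_given_a, p_a, p_b)
print(posterior, round(float(posterior), 3))  # 19/118 0.161
```

Even with a fairly accurate test, the updated probability stays low here because the condition is rare in this made-up scenario.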

Some Other Rules and Formulas

Probability is between 0 and 1: The likelihood of an event ranges from 0 (impossible) to 1 (certain). A probability of 0.5 means the event is as likely to happen as not.
The sum of all probabilities is 1: When you consider all possible outcomes of an experiment, the total probability is 1. If one outcome has a probability of 0.3, the other outcome (or outcomes) must add up to 0.7 to make 1.
Complement Rule: The probability of an event happening (P(A)) plus the probability of it not happening (P(not A)) equals 1. P(not A) is often written as 1−P(A).
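
The complement rule is handy for "at least one" questions. As a small sketch, assume a fair coin flipped three times: the probability of at least one head is 1 minus the probability of no heads. The enumeration below is only there to confirm the shortcut.

```python
from fractions import Fraction
from itertools import product

# Shortcut via the complement rule: P(at least one head) = 1 - P(no heads).
p_no_heads = Fraction(1, 2) ** 3
p_at_least_one_head = 1 - p_no_heads

# Cross-check by listing all 8 equally likely outcomes of three flips.
outcomes = list(product("HT", repeat=3))
with_head = [o for o in outcomes if "H" in o]
p_enumerated = Fraction(len(with_head), len(outcomes))

print(p_at_least_one_head)  # 7/8
```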

Statistics Formulas

Some of the common formulas for statistics are discussed below:

Mean

The mean is the average of a set of numbers. To find the mean, add up all the numbers in a dataset and then divide by the total number of values.

Mean = Sum of all values / Total number of values
OR
x̄ = ∑xi / N

Where,

x̄ is the mean,
∑xi is the sum of all terms in the data set,
N is the total number of terms.

Median

The median is the middle value in a dataset when it's arranged in ascending or descending order. If there's an even number of values, the median is the average of the two middle numbers.

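Median (Odd N)

Median = Value at the ((N + 1)/2)th position

Median (Even N)

Median = [Value at the (N/2)th position + Value at the (N/2 + 1)th position] / 2

Where,

N is the number of values in the data set, arranged in order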
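
The odd and even cases of the median can be written as one small function. This is a sketch rather than a library routine; `median_by_position` is a hypothetical name, and Python's built-in `statistics.median` gives the same results.

```python
import statistics

def median_by_position(values):
    """Median via the positional rule: middle value for odd N, mean of the two middle values for even N."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd N: the ((N + 1)/2)th value (1-based) sits at index N // 2 (0-based).
        return ordered[mid]
    # Even N: average the (N/2)th and (N/2 + 1)th values (1-based).
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_by_position([7, 1, 3]))     # 3
print(median_by_position([4, 1, 3, 2]))  # 2.5
```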

Mode

The mode is the value that appears most frequently in a dataset. A dataset may have one mode (unimodal), more than one mode (multimodal), or no mode at all.
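
One way to handle all three cases (one mode, several modes, no mode) is to count how often each value occurs. The function `modes` below is an illustrative sketch; it follows the convention used above that a data set with no repeated value has no mode.

```python
from collections import Counter

def modes(values):
    """Return every most-frequent value, or an empty list if no value repeats."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value appears once: treated here as "no mode"
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([1, 2, 2, 3]))     # [2]
print(modes([1, 1, 2, 2, 3]))  # [1, 2]
print(modes([1, 2, 3]))        # []
```

Note that Python's `statistics.multimode` uses a different convention: when no value repeats, it returns all the values instead of an empty list.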

Variance

Variance measures how spread out the values in a dataset are. It's calculated by finding the average of the squared differences between each value and the mean.

Variance = ∑(Each value − Mean)² / Total number of values
OR
σ² = ∑(xi − x̄)² / N

Where,

σ² is the variance,
∑(xi − x̄)² is the sum of squared differences between each term and the mean,
N is the total number of terms.

Standard Deviation

Standard deviation is the square root of the variance. It provides a more interpretable measure of how spread out the values are in comparison to the mean.

Standard Deviation = √Variance
OR
σ = √σ² = √[∑(xi − x̄)² / N]

Where,

xi represents each term in the data set,
x̄ is the mean,
σ² is the variance,
σ = √σ² is the standard deviation.
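
The variance and standard deviation formulas (dividing by N) can be sketched directly in Python. The data set below is a made-up example chosen so the results come out as whole numbers; the function names are illustrative.

```python
import math
import statistics

def population_variance(values):
    """Average of squared differences from the mean (divides by N)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

def population_std(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, mean = 5

print(population_variance(data))  # 4.0
print(population_std(data))       # 2.0
```

The standard library functions `statistics.pvariance` and `statistics.pstdev` compute the same population quantities, while `statistics.variance` and `statistics.stdev` divide by N − 1 for sample data.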

Topics under Probability and Statistics

Some important topics under both Probability and Statistics are discussed below:

Events in Probability

The various types of events in probability are:

Simple Event

A simple event is an event that consists of exactly one outcome of the sample space.

Example: In a coin flip, getting heads is a simple event, and getting tails is another.

P(Simple Event) = 1 / Total Possible Outcomes (when all outcomes are equally likely)

Compound Event

A compound event involves two or more simple events.

Example: Flipping a coin twice and getting heads both times.

P(Compound Event) = P(Event 1) × P(Event 2) (when the simple events are independent)
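
The two-flip example can be confirmed by listing the whole sample space. This sketch assumes a fair coin, so all four ordered outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of flipping a fair coin twice: HH, HT, TH, TT.
sample_space = list(product("HT", repeat=2))

# Compound event: heads on the first flip AND heads on the second flip.
both_heads = [outcome for outcome in sample_space if outcome == ("H", "H")]
p_both_heads = Fraction(len(both_heads), len(sample_space))

print(len(sample_space))  # 4
print(p_both_heads)       # 1/4
```

The result agrees with multiplying the two simple-event probabilities, 1/2 × 1/2.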

Independent Event

Independent events are events where the outcome of one does not affect the outcome of the other.
Example: Each flip of a fair coin is independent: getting heads once doesn't change the chance of getting heads again.

Dependent Event

Dependent events are events where the outcome of one affects the outcome of another.
Example: Drawing marbles from a bag without replacement: the second draw is affected by the first.

Complementary Event

A complementary event is the opposite of a given event. The complement of event A (denoted A′) includes all outcomes not in A.
Example: If A is the event of rolling an even number on a six-sided die, then A′ is the event of rolling an odd number.

P(Not A) = 1−P(A)

Probability Distribution

A probability distribution describes how the probabilities of different outcomes are spread across the possible values of a random variable. It provides a comprehensive view of the likelihood of each possible outcome, helping to understand the uncertainty associated with random events. There are two main types of probability distributions: discrete distributions, for random variables that take countable values, and continuous distributions, for random variables that take values over an interval.

Probability Functions

Probability functions provide mathematical representations of the probabilities associated with different values of a random variable. Two common types are Probability Mass Functions (PMFs) for discrete variables and Probability Density Functions (PDFs) for continuous variables.
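
As a sketch of a Probability Mass Function, consider the sum of two fair dice (an example chosen here for illustration). The PMF assigns a probability to each possible sum, and the probabilities add up to 1.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely ordered rolls give each sum.
rolls = list(product(range(1, 7), repeat=2))
counts = Counter(a + b for a, b in rolls)

# PMF: map each possible sum to its probability.
pmf = {total: Fraction(count, len(rolls)) for total, count in counts.items()}

print(pmf[2])             # 1/36
print(pmf[7])             # 1/6
print(sum(pmf.values()))  # 1
```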

Statistics Topics

Some of the key topics of statistics are:

Descriptive Statistics

Descriptive statistics is a branch of statistics focused on summarizing data and presenting it in forms such as graphs or tables. It uses summary statistics to provide a clear understanding of the data. A descriptive statistic serves as a condensed representation of data. Examples of descriptive statistics are given below.

Measures of Central Tendency

The central tendency of a set of data is measured by the following methods:

Mean: The average of a set of values. Add up all values and divide by the number of values.
Median: The middle value when data is arranged in order.
Mode: The most frequently occurring value in a dataset.

Example: For test scores of 80, 85, 90, 92, and 95, the mean is (80+85+90+92+95)/5 = 442/5 = 88.4, the median is 90, and there is no mode, as no value is repeated.

Measures of Variability

Standard Deviation: Indicates how spread out the values are from the mean.

Variance: The average of the squared differences from the mean.

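Example: The two sets of scores 70, 75, 80, 85, 90 and 60, 65, 70, 75, 80 have different means (80 and 70) but the same variance of 50, because every score in the second set is simply 10 lower and the spread is unchanged. By contrast, the set 60, 70, 80, 90, 100 also has a mean of 80 but a variance of 200, showing more variability than the first set.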

Inferential Statistics

In practical situations, collecting data from entire populations is often challenging. Descriptive statistics provide a solution by summarizing and organizing available data to offer insights. For instance, calculating the mean (average) and standard deviation from a sample can provide a snapshot of the central tendency and variability in a dataset.

However, when population-scale data collection is impractical, inferential statistics come into play. They involve drawing conclusions about entire populations based on samples. For example, if measuring the mean score of all U.S. high school students on the AP Physics exam is too extensive, inferential statistics make it possible to draw reliable conclusions from a manageable sample. This approach supports informed decision-making even when exhaustive data collection is not feasible.
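
One common inferential tool is a confidence interval for a population mean. The sketch below uses the large-sample normal approximation, mean ± z × s/√n, which is an assumption that fits reasonably large samples. The sample figures (mean 72, standard deviation 12, n = 64) are hypothetical, and `mean_confidence_interval` is an illustrative name.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Approximate confidence interval for a population mean (normal approximation)."""
    # z is the standard normal quantile, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample summary: 64 scores with mean 72 and standard deviation 12.
low, high = mean_confidence_interval(72, 12, 64)
print(round(low, 2), round(high, 2))  # 69.06 74.94
```

For small samples, a t-distribution is normally used in place of the normal quantile.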

Data Representations

Data representation involves the presentation of information in a meaningful and understandable manner. In statistics, this is crucial for analyzing and interpreting data effectively. Common methods of data representation include tables, bar graphs, histograms, pie charts, and line graphs.

Sampling Techniques

Methods of sampling are used to select a subset of individuals or items from a larger population for the purpose of making inferences about the population. Different sampling techniques are employed based on the nature of the study and the characteristics of the population. Common sampling techniques include simple random sampling, systematic sampling, stratified sampling, and cluster sampling.
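
Two widely used techniques, simple random sampling and systematic sampling, can be sketched as follows. The population of 100 numbered units, the seed, and the function `systematic_sample` are assumptions made for illustration.

```python
import random

# Hypothetical population: 100 units numbered 1 to 100.
population = list(range(1, 101))

# Simple random sampling: every unit has the same chance of selection.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
simple_random = rng.sample(population, 10)

def systematic_sample(items, step, start=0):
    """Systematic sampling: take every step-th item, beginning at index start."""
    return items[start::step]

# Every 10th unit, starting from the 4th unit in the list.
systematic = systematic_sample(population, 10, start=3)
print(systematic)  # [4, 14, 24, 34, 44, 54, 64, 74, 84, 94]
```

In practice the starting point of a systematic sample is usually chosen at random within the first interval.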

Probability and Statistics for Engineering Mathematics

Probability and Statistics form a crucial part of engineering mathematics, offering a foundation for making informed decisions and solving complex engineering problems. Here's a brief overview of how these mathematical fields apply to engineering:

Probability in Engineering

Risk Assessment and Safety Analysis: Engineers use probability to evaluate the risks associated with different engineering projects or processes, helping to design safer buildings, vehicles, and systems.
Quality Control and Reliability Engineering: Probability models help in assessing the reliability of components and systems, predicting failures, and improving product quality through rigorous testing protocols.
Signal Processing: In electrical and communication engineering, probability is used to analyze and filter signals, dealing with the randomness and noise in data transmission.
Decision Making under Uncertainty: Probability aids in making decisions when outcomes are uncertain, optimizing resources and strategies in situations with incomplete information.

Statistics in Engineering

Data Analysis and Interpretation: Engineers collect and analyze data to understand trends, draw conclusions, and support decision-making processes.
Experimental Design and Analysis: Statistical methods are used to design experiments, analyze results, and validate theories or models in fields ranging from material science to environmental engineering.
Process and Quality Improvement: Statistical tools like control charts and design of experiments (DoE) are pivotal in manufacturing and industrial engineering for process optimization and quality enhancement.
Predictive Modeling: Statistics support the creation of models to forecast future events or behaviors, critical in areas such as renewable energy, traffic flow management, and infrastructure development.

Probability and Statistics - Solved Examples

Example 1: Consider the following dataset: [5, 8, 2, 5, 3, 7, 9]. Calculate the mean, median, and mode.

Solution:

Mean = Sum of all values / Total number of values
= (5 + 8 + 2 + 5 + 3 + 7 + 9) / 7
⇒ 39/7 ≈ 5.571
Median:
The number of values in the data set is 7, which is odd.
Arranging the values in ascending order gives [2, 3, 5, 5, 7, 8, 9].
The median is the 4th value, which is 5.
Mode: The mode is 5, as it appears more frequently than any other number in the dataset.

Example 2: Given the dataset [12, 15, 18, 22, 25], calculate the variance and standard deviation.

Solution:

The given data set is [12, 15, 18, 22, 25]
Mean = Sum of all values / Total number of values
⇒ (12 + 15 + 18 + 22 + 25) / 5
⇒ 92/5
⇒ 18.4
Now,
Variance = ∑(Each value − Mean)² / Total number of values
⇒ σ² = [(12 − 18.4)² + (15 − 18.4)² + (18 − 18.4)² + (22 − 18.4)² + (25 − 18.4)²] / 5
⇒ [40.96 + 11.56 + 0.16 + 12.96 + 43.56] / 5
⇒ 109.2 / 5
⇒ 21.84
We know,
Standard deviation = √σ²
⇒ √21.84
⇒ σ ≈ 4.67

Example 3: In a deck of cards, what is the probability of drawing a red card?

Solution:

Total number of cards in a deck = 52
Total number of red cards in a deck = 26 (hearts + diamonds)
P(Red Card) = 26/52
⇒ P(Red Card) = 2/4
⇒ P(Red Card) = 1/2 or 0.5 or 50%
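
The count of red cards can be confirmed by building the deck in Python. This sketch assumes a standard 52-card deck with four suits of 13 ranks, where hearts and diamonds are the red suits.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]

# Every (rank, suit) pair is one card: 13 × 4 = 52 cards.
deck = list(product(ranks, suits))

# Favorable outcomes: cards whose suit is red.
red_cards = [card for card in deck if card[1] in {"hearts", "diamonds"}]
p_red = Fraction(len(red_cards), len(deck))

print(len(deck), len(red_cards))  # 52 26
print(p_red)                      # 1/2
```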

Example 4: A box contains 3 red balls, 2 green balls, and 5 blue balls. One ball is drawn at random. What is the probability that the ball drawn is (a) red, (b) green, (c) not blue?

Solution:

Total number of balls = 3 (Red) + 2 (Green) + 5 (Blue) = 10
P(Red) = Number of red balls / Total balls = 3/10
P(Green) = Number of green balls / Total balls = 2/10 = 1/5
P(Not Blue) = Number of non-blue balls (Red + Green) / Total balls = (3 + 2)/10 = 5/10 = 1/2

Example 5: A company has 6 departments, and the number of employees in each department is as follows: 25, 30, 28, 32, 35, 40. Find the population mean of the number of employees.

Solution:

The population mean μ is calculated by:
μ = Sum of the values in the population / Number of values in the population
μ = (25 + 30 + 28 + 32 + 35 + 40) / 6 = 190/6 ≈ 31.67

Practice Questions on Probability and Statistics

Problem 1: A bag contains 5 red marbles, 4 blue marbles, and 3 green marbles. What is the probability of randomly selecting a blue marble?

Problem 2: A survey is conducted on a sample of 100 people to estimate the average time spent daily on a mobile phone. The sample mean is 2.5 hours with a standard deviation of 1 hour. Calculate a 95% confidence interval for the population mean.

Problem 3: A fair six-sided die is rolled. What is the probability of rolling an even number or a number greater than 4?

Problem 4: Data Set: [8, 12, 15, 18, 10]. Calculate the variance and standard deviation.

Problem 5: Data Set: [10, 15, 12, 18, 15, 22, 20]. Find the mean, median, and mode of the given data set.

Related Articles:

Quartile Formula
Measure of Dispersion
Normal Distribution
Mean, Median, and Mode


URL: https://www.geeksforgeeks.org/maths/probability-and-statistics/
