Causal analysis is a technique used to understand why something happens by identifying cause–effect relationships. It helps analyze how changes in one variable affect another, supporting better decision-making across various fields. Causal analysis helps answer key questions such as:
Why did something happen?
What are its consequences?
How can it be improved or prevented?
For example, increasing the price of a product may lead to a decrease in its demand. Here, price is the cause and demand is the effect. By analyzing data, we can determine whether this relationship truly exists and how strong the impact is.
Correlation & Causation
Correlation: Refers to a situation where two variables change together, but one does not necessarily cause the other.
Values of the two variables tend to move together, either in the same direction or in opposite directions
The relationship may happen due to another hidden factor
Cannot by itself confirm a cause–effect relationship
Causation: Refers to a situation where a change in one variable directly causes a change in another.
One variable directly affects the other
Change in one leads to a predictable change in the other
Can be used to explain and predict outcomes
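To make the difference concrete, the following minimal sketch simulates two variables that are correlated only because they share a hidden factor. The scenario (temperature driving both ice cream sales and sunburn cases), the variable names, and all numeric values are made up for illustration; only the Python standard library is used.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hidden factor (hypothetical): daily temperature.
temperature = [rng.gauss(25, 5) for _ in range(5000)]

# Both variables depend on temperature, but neither depends on the other.
ice_cream_sales = [2.0 * t + rng.gauss(0, 5) for t in temperature]
sunburn_cases = [0.5 * t + rng.gauss(0, 2) for t in temperature]

r = pearson(ice_cream_sales, sunburn_cases)
print(f"correlation between sales and sunburn: {r:.2f}")
```

The correlation comes out clearly positive even though, by construction, changing ice cream sales would do nothing to sunburn cases. This is correlation without causation.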
Key Concepts
Cause and Effect: Cause is the factor that leads to a change, while effect is the result of that change.
Confounding Variable: A third factor that influences both the cause and the effect, making it seem like one affects the other even when it does not, or making the effect look stronger or weaker than it is.
Mediator Variable: This explains how or why a cause leads to an effect by acting as a link between the two.
Moderator Variable: It affects the strength or direction of the relationship between cause and effect under different conditions.
Intervention: An action taken to change a variable in order to observe its impact on the outcome.
Counterfactual: It represents what would have happened if a different action or condition had occurred instead of the actual one.
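The concepts above can be illustrated with a toy model in which a confounder Z influences both a cause X and an effect Y. This is only a sketch under assumed values: the coefficients (2.0 and 3.0), the function names, and the sample sizes are hypothetical and chosen so that the true effect of X on Y is known in advance.

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable

def outcome(x, z, noise):
    """Toy structural equation: the true effect of X on Y is 2.0."""
    return 2.0 * x + 3.0 * z + noise

def simulate(n, force_x=None):
    """Draw n (x, y) samples from the model Z -> X, Z -> Y, X -> Y.

    If force_x is given, X is set to that value regardless of Z
    (an intervention); otherwise X follows Z (plain observation).
    """
    rows = []
    for _ in range(n):
        z = rng.gauss(0, 1)  # confounder
        x = force_x if force_x is not None else z + rng.gauss(0, 1)
        rows.append((x, outcome(x, z, rng.gauss(0, 1))))
    return rows

def mean_y(rows):
    return sum(y for _, y in rows) / len(rows)

# Intervention: set X to 1 and to 0 for everyone, then compare average outcomes.
effect = mean_y(simulate(20000, force_x=1.0)) - mean_y(simulate(20000, force_x=0.0))

# Observation: the slope of Y on X mixes the causal effect with the confounder.
obs = simulate(20000)
mx = sum(x for x, _ in obs) / len(obs)
my = mean_y(obs)
observed_slope = (sum((x - mx) * (y - my) for x, y in obs)
                  / sum((x - mx) ** 2 for x, _ in obs))

# Counterfactual for one unit: same z and noise, different x.
z, noise = 0.5, -0.2
counterfactual_gain = outcome(1.0, z, noise) - outcome(0.0, z, noise)

print(f"effect under intervention: {effect:.2f}")
print(f"slope from observation:    {observed_slope:.2f}")
print(f"counterfactual gain:       {counterfactual_gain:.2f}")
```

In this model the intervention recovers a value close to the true effect, while the observational slope is noticeably larger because Z pushes X and Y in the same direction.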
Steps to Perform
Defining the Problem: Clearly identify the issue to be analyzed, as this sets the foundation for the process.
Identifying Variables: Break the problem into the key variables that can influence the outcome.
Collection of Data: Gather relevant and reliable data using methods like surveys, experiments, or existing datasets.
Establishing Relationships: Determine how variables are related using appropriate tools or methods.
Distinguishing Correlation from Causation: Check whether the relationships are causal and not just coincidental.
Considering Confounding Variables: Identify other factors that may influence the relationship and affect results.
Interpreting the Results: Analyze the findings to draw meaningful conclusions and support decision-making.
Common Methods
Experimental Method: Involves manipulating one variable and observing its effect on another under controlled conditions, e.g., testing how a new medicine affects patients.
Quasi-Experimental Method: Similar to experiments but uses existing groups instead of randomly created ones, e.g., comparing the performance of students from different schools.
Observational Method: Data is studied as it is, without any manipulation, to understand relationships, e.g., analyzing trends in real-world data.
Regression Analysis: A statistical method used to estimate how much one variable changes with another, e.g., how price impacts demand. The estimate can be read as causal only when confounders are accounted for.
Causal Graphs (DAGs): Diagrams that show how different variables are connected, helping to understand cause–effect relationships and hidden factors.
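As a rough illustration of regression analysis with a confounder, the sketch below revisits the price and demand example. It assumes a hypothetical confounder called `season` that raises both price and demand; all coefficients and names are invented for the example. The adjustment is done by removing the part of each variable explained by the confounder and regressing the leftovers on each other, which gives the same slope as a regression of demand on price and season together.

```python
import random

def slope(xs, ys):
    """Least-squares slope of ys on xs (simple linear regression)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def residuals(xs, ys):
    """What is left of ys after removing its linear fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = slope(xs, ys)
    return [(y - my) - b * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(2)  # fixed seed so the run is repeatable
n = 10000

# Hypothetical confounder: a seasonal factor that raises both price and demand.
season = [rng.gauss(0, 1) for _ in range(n)]
price = [10 + 1.0 * s + rng.gauss(0, 1) for s in season]
# True effect of price on demand in this toy model is -2.0.
demand = [100 - 2.0 * p + 5.0 * s + rng.gauss(0, 1) for p, s in zip(price, season)]

# Naive regression ignores the confounder.
naive = slope(price, demand)

# Adjusted regression: strip the season-related part from both variables first.
adjusted = slope(residuals(season, price), residuals(season, demand))

print(f"naive slope:    {naive:.2f}")
print(f"adjusted slope: {adjusted:.2f}")
```

In this setup the naive slope even has the wrong sign, while the adjusted slope is close to the true effect. The adjustment only works here because the confounder is known and measured, which is often not the case with real data.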
Implementation
Suppose we want to understand how a customer’s total bill influences the tip amount. The goal is to analyze whether an increase in total bill leads to a higher tip using causal analysis.
The dataset used is publicly available and contains information about restaurant bills and tips. It can be downloaded by clicking here.
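A first step could look like the sketch below, which reads the two columns of interest and fits a straight line of tip against total bill. The column names `total_bill` and `tip` and the file name `tips.csv` are assumptions about how the downloaded file is laid out, so adjust them to match the actual data.

```python
import csv

def read_bill_and_tip(lines):
    """Parse CSV text lines and return two lists: total bills and tips.

    Assumes a header row with columns named total_bill and tip.
    """
    bills, tips = [], []
    for row in csv.DictReader(lines):
        bills.append(float(row["total_bill"]))
        tips.append(float(row["tip"]))
    return bills, tips

def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line ys ~ xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    return my - b * mx, b
```

With the file saved locally, it could be used as `with open("tips.csv", newline="") as f: bills, tips = read_bill_and_tip(f)` followed by `intercept, slope = fit_line(bills, tips)`. The slope then describes how much the tip changes, on average, per unit of total bill. Reading it as a causal effect additionally assumes that no other factor, such as party size, drives both the bill and the tip.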