The Likelihood Ratio Test (LRT) is a fundamental statistical technique for comparing the goodness of fit of two competing models: a null model (the simpler one) and an alternative model (the more complex one). It helps determine whether the more complex model provides a statistically significant improvement in fit over the simpler model.
The core objective of the LRT is to evaluate whether introducing additional parameters into a model significantly improves its fit to the data. The two hypotheses are:
- Null Hypothesis (H0): The simpler (reduced) model fits the data adequately.
- Alternative Hypothesis (H1): The more complex (full) model provides a better fit.
Mathematics Behind Likelihood Ratio Test
1. Likelihood Function
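The likelihood function for a set of observations X = {x1, x2, …, xn} and a parameter vector θ is given by:
$$L(\theta) = P(X \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$$
Where:
- L(θ) is the likelihood function.
- X is the observed data.
- θ is the parameter (or vector of parameters) of the model.
- f(x_i | θ) is the probability density (or mass) function of a single observation; the product form assumes the observations are independent.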
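For independent observations the likelihood is a product of per-observation probabilities, so in practice one usually works with its logarithm, which is a sum. The sketch below illustrates this for a Bernoulli (0/1) model; the function name `bernoulli_log_likelihood` and the sample data are assumptions made up for this example.

```python
import math

def bernoulli_log_likelihood(data, p):
    """Log-likelihood ln L(p) of i.i.d. 0/1 observations with success probability p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    successes = sum(data)
    failures = len(data) - successes
    # ln of the product p^successes * (1 - p)^failures
    return successes * math.log(p) + failures * math.log(1.0 - p)

# Hypothetical sample: 7 successes in 10 trials.
sample = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
p_hat = sum(sample) / len(sample)  # the MLE of p is the sample proportion
ll_at_mle = bernoulli_log_likelihood(sample, p_hat)
ll_at_half = bernoulli_log_likelihood(sample, 0.5)
```

The log-likelihood at the maximum likelihood estimate `p_hat` is never smaller than at any other value of p, which is why `ll_at_mle` is at least `ll_at_half`.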
2. Likelihood Ratio Statistic
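The Likelihood Ratio (LR) statistic compares the maximized likelihood of the reduced model (H0) to that of the full model (H1):
$$\Lambda = \frac{L(\theta_0)}{L(\hat{\theta})}$$
Where:
- L(θ0) is the maximum likelihood under the null hypothesis (H0).
- L(θ̂) is the maximum likelihood under the alternative hypothesis (H1).
Because the reduced model is a special case of the full model, 0 ≤ Λ ≤ 1, and values close to 0 are evidence against H0.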
3. Log-Likelihood Ratio Test Statistic
The log-transformed likelihood ratio is:
$$D = -2 \ln \Lambda = -2 \left[ \ln L(\theta_0) - \ln L(\hat{\theta}) \right]$$
4. Chi-Square Distribution and Degrees of Freedom
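Under the null hypothesis, the test statistic D approximately follows a chi-square distribution in large samples:
$$D \sim \chi^2_{k}$$
where the degrees of freedom k equal the difference in the number of free parameters between the full model and the reduced model.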
Hypothesis Testing Using LRT
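To decide whether to reject H0, compare the LRT statistic D with the critical value of the chi-square distribution at a given significance level (α):
$$\text{Reject } H_0 \text{ if } D > \chi^2_{k,\alpha}$$
Here χ²(k, α) is the upper-α critical value of the chi-square distribution with k degrees of freedom, and k is the number of additional parameters in the full model. Equivalently, reject H0 if the p-value P(χ²_k > D) is smaller than α; otherwise, fail to reject H0.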
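The decision procedure can be sketched in Python using only the standard library. The helper names `chi2_sf` and `likelihood_ratio_test` and the log-likelihood values are hypothetical. The tail probability is computed for integer degrees of freedom with the standard recurrence for the regularized upper incomplete gamma function, rather than taken from a statistics package.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-square_df > x) for a positive integer df."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    z = x / 2.0
    # Closed-form starting points: Q(1/2, z) = erfc(sqrt(z)) and Q(1, z) = exp(-z).
    if df % 2 == 1:
        a, tail = 0.5, math.erfc(math.sqrt(z))
    else:
        a, tail = 1.0, math.exp(-z)
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        tail += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return tail

def likelihood_ratio_test(loglik_null, loglik_full, df, alpha=0.05):
    """Return (D, p_value, reject_H0) from the maximized log-likelihoods of nested models."""
    d_stat = -2.0 * (loglik_null - loglik_full)
    p_value = chi2_sf(d_stat, df)
    # Rejecting when p_value < alpha is the same as D exceeding the critical value.
    return d_stat, p_value, p_value < alpha

# Hypothetical maximized log-likelihoods; the full model has 2 extra parameters.
d_stat, p_value, reject = likelihood_ratio_test(-120.5, -115.2, df=2)
```

With these made-up inputs D = 10.6, which exceeds the 5% critical value for 2 degrees of freedom (about 5.99), so H0 would be rejected.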
Nested Models in LRT
LRT is applicable only when the models are nested, meaning the null model is a special case of the alternative model.
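- Nested Models: Model 1 can be obtained from Model 2 by constraining some of its parameters (for example, fixing certain coefficients at zero).
- Non-Nested Models: If the models are not nested, the chi-square reference distribution does not apply, and alternative criteria such as AIC or BIC should be used instead.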
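As an illustration of nesting, consider two linear predictors. The regression form and the symbols β0, β1, β2, x1, x2 are assumptions chosen for this example.

```latex
\begin{align*}
\text{Model 2 (full):}\quad & \eta = \beta_0 + \beta_1 x_1 + \beta_2 x_2 \\
\text{Model 1 (reduced):}\quad & \eta = \beta_0 + \beta_1 x_1 \qquad (\beta_2 = 0)
\end{align*}
```

Model 1 is Model 2 with the constraint β2 = 0, so the pair is nested and the LRT has 1 degree of freedom. By contrast, η = β0 + β1 x1 and η = β0 + β2 x2 are not nested, since neither can be obtained by constraining the other.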
Likelihood Ratio Test in Logistic Regression
LRT is extensively used in logistic regression to compare models with different sets of predictor variables.
- Model 1: Basic model with only a constant (intercept) term.
- Model 2: Model with one or more predictor variables.
- LRT: Determines whether the predictors significantly improve the model.
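A minimal sketch of this comparison, assuming a single binary (0/1) predictor: in that special case the fitted probabilities of both logistic models are available in closed form (the overall proportion for Model 1, the per-group proportions for Model 2), so no iterative fitting is needed. The function names and the counts are hypothetical.

```python
import math

def binomial_loglik(successes, trials, p):
    """Bernoulli log-likelihood of `successes` out of `trials` at probability p."""
    ll = 0.0
    if successes > 0:
        ll += successes * math.log(p)
    if trials - successes > 0:
        ll += (trials - successes) * math.log(1.0 - p)
    return ll

def logistic_lrt_binary_predictor(y0, n0, y1, n1):
    """LRT of an intercept-only logistic model against one with a single 0/1 predictor.

    y0/n0: successes/trials where the predictor is 0; y1/n1: where it is 1.
    """
    # Model 1 (intercept only): one common success probability.
    p_common = (y0 + y1) / (n0 + n1)
    ll_null = binomial_loglik(y0, n0, p_common) + binomial_loglik(y1, n1, p_common)
    # Model 2 (intercept + predictor): fitted probabilities equal the group proportions.
    ll_full = binomial_loglik(y0, n0, y0 / n0) + binomial_loglik(y1, n1, y1 / n1)
    d_stat = 2.0 * (ll_full - ll_null)
    # One extra parameter, so df = 1 and P(chi2_1 > d) = erfc(sqrt(d / 2)).
    p_value = math.erfc(math.sqrt(max(d_stat, 0.0) / 2.0))
    return d_stat, p_value

# Hypothetical data: 12 of 40 successes when the predictor is 0, 25 of 40 when it is 1.
d_stat, p_value = logistic_lrt_binary_predictor(12, 40, 25, 40)
```

With continuous or multiple predictors, the two log-likelihoods would instead come from a fitted logistic regression; the statistic and the chi-square comparison stay the same.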
LRT for Generalized Linear Models (GLMs)
LRT can be applied to Generalized Linear Models (GLMs) such as Poisson, Gamma, and Binomial models.
- Null Model: Reduced model with fewer predictors.
- Full Model: Model with all predictors.
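- Decision Rule: Reject the null if the likelihood ratio test statistic exceeds the chi-square critical value. For families with a fixed dispersion, such as Poisson and binomial, this statistic equals the difference in deviance between the two models.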
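As a sketch for the Poisson case, assume the only predictor is a grouping factor. The null model then has one common mean and the full model one mean per group, and both maximum likelihood fits are simple averages. The function names and the counts below are made up for illustration.

```python
import math

def poisson_loglik(counts, mu):
    """Poisson log-likelihood of the counts at a common mean mu > 0."""
    return sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y in counts)

def poisson_group_lrt(groups):
    """LRT of a common-mean Poisson model against one mean per group.

    Assumes every group contains at least one non-zero count.
    """
    pooled = [y for g in groups for y in g]
    # Null model: a single mean estimated from all observations.
    ll_null = poisson_loglik(pooled, sum(pooled) / len(pooled))
    # Full model: each group gets its own mean.
    ll_full = sum(poisson_loglik(g, sum(g) / len(g)) for g in groups)
    d_stat = 2.0 * (ll_full - ll_null)
    df = len(groups) - 1  # extra parameters in the full model
    return d_stat, df

# Hypothetical counts for three groups.
groups = [[2, 3, 1, 4, 2], [5, 7, 6, 4, 8], [3, 2, 4, 3, 3]]
d_stat, df = poisson_group_lrt(groups)
# For df = 2 the chi-square tail probability has the closed form exp(-D / 2).
p_value = math.exp(-d_stat / 2.0)
```

Here D is about 9.29 on 2 degrees of freedom, above the 5% critical value of about 5.99, so the common-mean model would be rejected for these made-up counts.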
Wald Test vs. Likelihood Ratio Test
The Wald Test and the Likelihood Ratio Test (LRT) are both used for hypothesis testing but differ in how they approach the problem:
- Wald Test: Evaluates whether the estimated parameters deviate significantly from their hypothesized values, measured relative to their estimated standard errors.
- LRT: Compares the fit of two models by looking at the ratio of their likelihoods.
When to Use:
- LRT is often preferred when models are nested and you need to compare goodness of fit; it requires fitting both models.
- Wald Test requires fitting only the full model, so it is cheaper to compute and is often used for testing individual parameters.
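The difference can be made concrete with a simple case that is assumed here only for illustration: testing H0: p = p0 for a Bernoulli proportion. The function name and the numbers are hypothetical.

```python
import math

def wald_and_lrt_for_proportion(successes, n, p0):
    """Wald and LR statistics for H0: p = p0, given `successes` out of `n` trials.

    Assumes 0 < successes < n so that the MLE lies strictly inside (0, 1).
    """
    p_hat = successes / n
    # Wald: squared distance from p0, scaled by the variance estimated at the MLE.
    wald = (p_hat - p0) ** 2 / (p_hat * (1.0 - p_hat) / n)

    def loglik(p):
        return successes * math.log(p) + (n - successes) * math.log(1.0 - p)

    # LRT: twice the log-likelihood gap between the MLE and the null value.
    lrt = 2.0 * (loglik(p_hat) - loglik(p0))
    return wald, lrt

# Hypothetical data: 36 successes in 50 trials, tested against p0 = 0.5.
wald, lrt = wald_and_lrt_for_proportion(successes=36, n=50, p0=0.5)
```

Both statistics are compared with the same chi-square distribution with 1 degree of freedom. They generally differ in finite samples (here roughly 12.0 versus 10.0) and agree asymptotically.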
Score Test (Lagrange Multiplier Test) vs. LRT
The Score Test (also known as the Lagrange Multiplier Test) evaluates the gradient (score) of the log-likelihood at the parameter values estimated under the null hypothesis. If H0 holds, this gradient should be close to zero.
- LRT: Compares the likelihood of the full and reduced models.
- Score Test: Uses the gradient of the log-likelihood, scaled by the Fisher information, to assess the adequacy of the null model.
Difference:
- LRT requires fitting both models, while the Score Test only needs the null model.
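A minimal sketch of the score statistic, again assuming a Bernoulli proportion with H0: p = p0 as the illustrative case. Only quantities evaluated at the null value p0 are needed. The function name and the data are hypothetical.

```python
def score_test_for_proportion(successes, n, p0):
    """Score (Lagrange multiplier) statistic for H0: p = p0 in a Bernoulli model."""
    # Score: derivative of the log-likelihood with respect to p, evaluated at p0.
    score = successes / p0 - (n - successes) / (1.0 - p0)
    # Fisher information for n trials, evaluated at p0.
    information = n / (p0 * (1.0 - p0))
    return score ** 2 / information

# Hypothetical data: 36 successes in 50 trials, tested against p0 = 0.5.
score_stat = score_test_for_proportion(successes=36, n=50, p0=0.5)
```

The result is compared with a chi-square distribution with 1 degree of freedom. No estimate under the alternative is computed at any point.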
Akaike Information Criterion (AIC) vs. Likelihood Ratio Test
Both Akaike Information Criterion (AIC) and LRT assess model performance but use different approaches:
- LRT: Compares nested models using hypothesis testing.
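- AIC: Scores each model as AIC = 2k − 2 ln L, where k is the number of estimated parameters and L is the maximized likelihood, and prefers the model with the lowest score. The 2k term penalizes additional parameters to discourage overfitting, and the models do not need to be nested.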
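The two approaches can be contrasted on the same pair of fits. The sketch below uses the standard definition AIC = 2k − 2 ln L; the log-likelihoods and parameter counts are hypothetical.

```python
def aic(log_likelihood, num_params):
    """Akaike Information Criterion, 2k - 2 ln L; lower values are preferred."""
    return 2 * num_params - 2.0 * log_likelihood

# Hypothetical fits: the full model has 2 more parameters than the reduced one.
aic_reduced = aic(-120.5, num_params=3)
aic_full = aic(-115.2, num_params=5)
preferred = "full" if aic_full < aic_reduced else "reduced"
```

For nested models the AIC difference equals D minus twice the number of extra parameters, so AIC favors the full model exactly when D > 2k, whereas the LRT compares D with a chi-square critical value.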
Advantages of Likelihood Ratio Test
- Ideal for comparing nested models.
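- Statistically powerful: by the Neyman-Pearson lemma, the likelihood ratio test is the most powerful test for simple hypotheses, and it is asymptotically efficient under standard regularity conditions.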
- Flexible enough to handle various types of models.
Limitations of Likelihood Ratio Test
- Requires that models are nested.
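- Sensitive to model assumptions: the likelihood must be correctly specified for the chi-square result to hold.
- Relies on a large-sample approximation: the chi-square reference distribution can be inaccurate for small samples, when the alternative model has many parameters relative to the sample size, or when the null value lies on the boundary of the parameter space.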