Linear regression assumes that the errors are normally distributed and that all relevant predictor variables have a linear relationship with the outcome. In the real world, these assumptions do not always hold, and when they do not, Bayesian regression could be the better choice.
Bayesian regression uses prior beliefs or knowledge about the data to "learn" more about it and produce more accurate predictions. It also accounts for the uncertainty in the data, which makes it a good choice when the data is complex or ambiguous.
Bayesian regression leverages Bayes' theorem to estimate the parameters of a linear model, incorporating both observed data and prior beliefs about the parameters. Unlike ordinary least squares (OLS) regression, which provides point estimates, Bayesian regression produces probability distributions over possible parameter values, offering a measure of uncertainty in predictions.
The important concepts in Bayesian Regression are as follows:
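Bayes’ theorem describes how prior knowledge is updated with new data:

$$P(w \mid X, Y) = \frac{P(Y \mid X, w)\, P(w)}{P(Y \mid X)}$$

where:

- $P(w \mid X, Y)$ is the posterior: the distribution of the parameters $w$ after seeing the data.
- $P(Y \mid X, w)$ is the likelihood of the observed data given the parameters.
- $P(w)$ is the prior belief about the parameters.
- $P(Y \mid X)$ is the marginal likelihood (evidence), which normalizes the posterior.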
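As a small numerical illustration of this update rule, the sketch below applies Bayes' theorem on a grid of candidate slope values. The toy data, the Normal(0, 2²) prior, the fixed noise standard deviation of 1, and all variable names are assumptions made for the example, not part of the original tutorial.

```python
import numpy as np

# Hypothetical toy data: y = 2 * x + noise, with a known noise std of 1.0.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + rng.normal(0.0, 1.0, size=x.size)

# A grid of candidate slopes stands in for the continuous parameter.
slopes = np.linspace(-10.0, 10.0, 2001)

# Assumed prior: slope ~ Normal(0, 2^2), written as an unnormalised log-density.
prior_std = 2.0
log_prior = -0.5 * (slopes / prior_std) ** 2

# Log-likelihood of the data for every candidate slope (noise std fixed at 1).
residuals = y[None, :] - slopes[:, None] * x[None, :]
log_likelihood = -0.5 * np.sum(residuals ** 2, axis=1)

# Bayes' theorem: posterior is proportional to likelihood times prior.
# Dividing by the sum plays the role of the evidence term.
log_post = log_likelihood + log_prior
posterior = np.exp(log_post - log_post.max())
posterior /= posterior.sum()

posterior_mean = float(np.sum(slopes * posterior))
posterior_std = float(np.sqrt(np.sum((slopes - posterior_mean) ** 2 * posterior)))
print(f"posterior mean = {posterior_mean:.3f}, posterior std = {posterior_std:.3f}")
```

The posterior is narrower than the prior because the data add information about the slope.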
The likelihood function represents the probability of the observed data given certain parameter values. Assuming normal errors, the relationship between the independent variables $X$ and the target variable $Y$ is:

$$Y = w_0 + w_1 X_1 + \dots + w_p X_p + \varepsilon$$

where $\varepsilon$ follows a normal distribution with mean $0$ and variance $\sigma^2$.
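The Gaussian likelihood can be evaluated directly for any candidate parameter vector. The following is a minimal sketch; the function name `gaussian_log_likelihood`, the synthetic data, and the "true" weights are hypothetical choices for illustration.

```python
import numpy as np

def gaussian_log_likelihood(X, y, w, sigma):
    """Log-likelihood of y under y = X @ w + eps, with eps ~ N(0, sigma^2)."""
    residuals = y - X @ w
    n = y.shape[0]
    return (-0.5 * n * np.log(2.0 * np.pi * sigma ** 2)
            - 0.5 * np.sum(residuals ** 2) / sigma ** 2)

# Hypothetical data: intercept 1, slope 2, noise std 1.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(50), rng.uniform(0.0, 10.0, size=50)])
w_true = np.array([1.0, 2.0])
y = X @ w_true + rng.normal(0.0, 1.0, size=50)

# Parameters close to the data-generating values score a higher likelihood.
ll_true = gaussian_log_likelihood(X, y, w_true, sigma=1.0)
ll_off = gaussian_log_likelihood(X, y, np.array([1.0, 1.0]), sigma=1.0)
print(f"log-likelihood at true w: {ll_true:.1f}, at a wrong slope: {ll_off:.1f}")
```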
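Bayesian regression offers several advantages over traditional regression techniques: it incorporates prior beliefs about the parameters, it yields probability distributions rather than point estimates, and it is more robust when data is limited or predictors are collinear.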
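For a dataset with $N$ samples, the linear relationship is:

$$y_i = w_0 + w_1 x_{i1} + \dots + w_p x_{ip} + \varepsilon_i = f(x_i, w) + \varepsilon_i$$

where $w$ are the regression coefficients and $\varepsilon_i \sim \mathcal{N}(0, \sigma^2)$.

The probability density function of $Y$ given $X$ is:

$$P(y_i \mid x_i, w, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[-\frac{(y_i - f(x_i, w))^2}{2\sigma^2}\right]$$

For $N$ observations:

$$L(Y \mid X, w, \sigma^2) = \prod_{i=1}^{N} P(y_i \mid x_i, w, \sigma^2)$$

which simplifies to:

$$L(Y \mid X, w, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left[-\frac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - f(x_i, w))^2\right]$$

Taking the logarithm of the likelihood function:

$$\ln L(Y \mid X, w, \sigma^2) = -\frac{N}{2} \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - f(x_i, w))^2$$

We define precision $\beta$ as:

$$\beta = \frac{1}{\sigma^2}$$

Substituting into the likelihood function:

$$\ln L(Y \mid X, w, \beta^{-1}) = \frac{N}{2} \ln \beta - \frac{N}{2} \ln(2\pi) - \frac{\beta}{2} \sum_{i=1}^{N} (y_i - f(x_i, w))^2$$

The negative log-likelihood is:

$$-\ln L(Y \mid X, w, \beta^{-1}) = \frac{\beta}{2} \sum_{i=1}^{N} (y_i - f(x_i, w))^2 + \text{const}$$

where the constant collects the terms that do not depend on $w$.

Placing a zero-mean Gaussian prior with precision $\alpha$ on the weights, $P(w \mid \alpha) = \mathcal{N}(w \mid 0, \alpha^{-1} I)$, and taking the logarithm of the posterior:

$$\ln P(w \mid X, Y) = \ln L(Y \mid X, w, \beta^{-1}) + \ln P(w \mid \alpha) - \ln P(Y \mid X)$$

Substituting the expressions:

$$\ln P(w \mid X, Y) = -\frac{\beta}{2} \sum_{i=1}^{N} (y_i - f(x_i, w))^2 - \frac{\alpha}{2} w^\top w + \text{const}$$

Maximizing this expression, or equivalently minimizing $\frac{\beta}{2} \sum_{i=1}^{N} (y_i - f(x_i, w))^2 + \frac{\alpha}{2} w^\top w$, gives the maximum a posteriori (MAP) estimate, which is equivalent to ridge regression with regularization strength $\lambda = \alpha / \beta$.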
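The ridge equivalence can be checked numerically. The sketch below assumes a zero-mean Gaussian prior with precision `alpha` on all weights (including the intercept) and a known noise precision `beta`; these values and the synthetic data are illustrative assumptions. It computes the ridge solution with strength `alpha / beta` and confirms that the gradient of the negative log-posterior vanishes there.

```python
import numpy as np

# Hypothetical data: intercept 1, slope 2, noise std 1.
rng = np.random.default_rng(2)
N = 30
X = np.column_stack([np.ones(N), rng.uniform(0.0, 10.0, size=N)])
y = X @ np.array([1.0, 2.0]) + rng.normal(0.0, 1.0, size=N)

alpha = 0.5  # assumed prior precision on the weights
beta = 1.0   # assumed noise precision, beta = 1 / sigma^2

def neg_log_posterior(w):
    """beta/2 * sum of squared residuals + alpha/2 * w^T w (constants dropped)."""
    r = y - X @ w
    return 0.5 * beta * (r @ r) + 0.5 * alpha * (w @ w)

# Ridge solution with regularisation strength lambda = alpha / beta.
lam = alpha / beta
w_map = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Gradient of the negative log-posterior; it should be zero at the MAP estimate.
grad = -beta * X.T @ (y - X @ w_map) + alpha * w_map
print("MAP weights:", w_map)
print("gradient norm at MAP:", np.linalg.norm(grad))
```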
Bayesian regression provides a probabilistic framework for linear regression by incorporating prior knowledge. Instead of estimating a single set of parameters, we obtain a distribution over possible parameters, which enhances robustness in situations with limited data or multicollinearity.
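To make the idea of a distribution over parameters concrete, the sketch below uses the standard closed-form Gaussian posterior for linear regression with a zero-mean Gaussian prior and a known noise precision. The helper name `weight_posterior`, the values of `alpha` and `beta`, and the synthetic data are assumptions for illustration. It compares the posterior spread of the slope with 5 and with 200 observations.

```python
import numpy as np

def weight_posterior(X, y, alpha, beta):
    """Gaussian posterior N(m, S) over the weights.

    Assumes prior w ~ N(0, I / alpha) and known noise precision beta.
    """
    S_inv = alpha * np.eye(X.shape[1]) + beta * X.T @ X
    S = np.linalg.inv(S_inv)
    m = beta * S @ X.T @ y
    return m, S

# Hypothetical data: intercept 1, slope 2, noise std 1.
rng = np.random.default_rng(3)
x_all = rng.uniform(0.0, 10.0, size=200)
X_all = np.column_stack([np.ones_like(x_all), x_all])
y_all = X_all @ np.array([1.0, 2.0]) + rng.normal(0.0, 1.0, size=200)

for n in (5, 200):
    m, S = weight_posterior(X_all[:n], y_all[:n], alpha=0.5, beta=1.0)
    slope_std = np.sqrt(S[1, 1])
    print(f"n={n:3d}  slope mean={m[1]:.3f}  slope std={slope_std:.3f}")
```

With few observations the posterior standard deviation of the slope is larger, which is the uncertainty a point estimate would hide.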
| Feature | Traditional Linear Regression | Bayesian Regression |
|---|---|---|
| Assumptions | Errors are normally distributed; no prior information is used | Places prior distributions on parameters and models their uncertainty |
| Estimates | Point estimates of parameters | Probability distributions over parameters |
| Flexibility | Limited; assumes strict linearity | More flexible; priors and model structure can be adapted, including to non-linear relationships |
| Data Requirement | Estimates can be unreliable on small datasets | Can work well with small datasets when the priors are informative |
| Uncertainty Handling | Does not quantify uncertainty | Provides uncertainty estimates via posterior distributions |
The first implementation uses Pyro with Stochastic Variational Inference (SVI) to approximate the posterior distribution of the parameters (slope, intercept, and noise scale sigma) in a Bayesian linear regression model. The Adam optimizer maximizes the Evidence Lower Bound (ELBO) by minimizing its negative, which serves as the loss, making the inference computationally efficient.
First, we import the libraries needed for Bayesian regression: torch and pyro, the SVI, Trace_ELBO, and Predictive classes from pyro.infer, the Adam optimizer from pyro.optim, and matplotlib and seaborn for plotting.
We create synthetic data for linear regression:
The guide function approximates the posterior distribution of the parameters. It uses pyro.param to learn a mean (loc) and a standard deviation (scale) for each parameter.

Step 5: Train the Model using SVI
Output
Iteration 100/1000 - Loss: 857.5039891600609
Iteration 200/1000 - Loss: 76392.14724761248
Iteration 300/1000 - Loss: 4466.2376717329025
Iteration 400/1000 - Loss: 70616.07956075668
Iteration 500/1000 - Loss: 7564.8086141347885
Iteration 600/1000 - Loss: 86843.96660631895
Iteration 700/1000 - Loss: 155.43085688352585
Iteration 800/1000 - Loss: 248.03456103801727
Iteration 900/1000 - Loss: 353587.08260041475
Iteration 1000/1000 - Loss: 253.0774005651474
The Predictive class draws samples from the approximate posterior using the trained guide.

Output
Estimated Parameters:
Estimated Slope: 1.0719
Estimated Intercept: 1.1454
Estimated Sigma: 2.2641
We plot the distributions of the inferred parameters (slope, intercept, and sigma) using seaborn.
In the second implementation, we use Bayesian Linear Regression with Markov Chain Monte Carlo (MCMC) sampling in PyMC, allowing for a probabilistic interpretation of the regression parameters and their uncertainties.
Here, we import the required libraries for the task. These libraries include os, pytensor, pymc, numpy, and matplotlib.
PyMC uses PyTensor (a fork of Aesara, itself a successor to Theano) as the backend for running computations. We clear the cache to avoid any potential issues with stale compiled code.
We combine setting the random seed and generating synthetic data in this step. The random seed ensures reproducibility, and the synthetic data is generated for the linear regression model.
Now, we define the Bayesian model using PyMC. Here, we specify the priors for the model parameters (slope, intercept, and sigma), and the likelihood function for the observed data.
After defining the model, we sample from the posterior using MCMC (Markov Chain Monte Carlo). The pm.sample() function draws samples from the posterior distributions of the model parameters.
We set draws=2000 for the number of samples, tune=1000 for the tuning steps, and cores=1 to use a single core for the sampling process.

Finally, we plot the posterior distributions of the parameters (slope, intercept, and sigma) to visualize the uncertainty in their estimates. pm.plot_posterior() plots the distributions, showing the most likely values for each parameter.