The glmnet package in R fits linear and generalized linear models with Lasso (L1), Ridge (L2) and Elastic Net regularization. These techniques add a penalty on the size of the coefficients, which discourages overly complex models, helps prevent overfitting and improves performance on new data.
Regularized Regression
A type of regression that adds a penalty term to the cost function to reduce overfitting.
Lasso Regression: A type of regularized regression that adds an L1 penalty term to the cost function.
Ridge Regression: A type of regularized regression that adds an L2 penalty term to the cost function.
Elastic Net Regression: A type of regularized regression that adds both L1 and L2 penalty terms to the cost function.
Syntax
glmnet(x, y, family = "gaussian", alpha = 1, lambda = NULL)
The main function in the glmnet package is glmnet() which fits a regularized generalized linear model. The function accepts a number of important arguments:
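x: The matrix of predictor variables.
y: The response variable.
family: The type of response; "gaussian" corresponds to ordinary linear regression.
alpha: Sets the mix of the two penalties (Lasso: alpha = 1, Ridge: alpha = 0, Elastic Net: 0 < alpha < 1).
lambda: Regularization parameter that controls the strength of the penalty; larger values shrink the coefficients more. When left as NULL, glmnet() computes its own sequence of lambda values.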
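To make the roles of alpha and lambda concrete, here is a minimal sketch of the elastic net penalty in the form given in the glmnet documentation. The helper elastic_net_penalty() is hypothetical and written only for illustration; it is not part of the glmnet package, and the coefficient vector is made up.

```r
# Penalty added to the loss, in the form documented for glmnet:
#   lambda * ((1 - alpha) / 2 * sum(beta^2) + alpha * sum(abs(beta)))
# `elastic_net_penalty` is a hypothetical helper, not a glmnet function.
# `beta` holds the slope coefficients only; the intercept is not penalized.
elastic_net_penalty <- function(beta, lambda, alpha) {
  l1 <- sum(abs(beta))  # L1 norm (Lasso part)
  l2 <- sum(beta^2)     # squared L2 norm (Ridge part)
  lambda * ((1 - alpha) / 2 * l2 + alpha * l1)
}

beta <- c(2, -1, 0)  # made-up coefficients for illustration

elastic_net_penalty(beta, lambda = 0.5, alpha = 1)    # pure Lasso
elastic_net_penalty(beta, lambda = 0.5, alpha = 0)    # pure Ridge
elastic_net_penalty(beta, lambda = 0.5, alpha = 0.5)  # Elastic Net mix
```

Doubling lambda doubles the penalty for any alpha, while alpha only changes how that penalty is split between the L1 and L2 parts.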
Lasso regression helps prevent overfitting by applying an L1 penalty that shrinks the coefficients of less important features and can set some of them exactly to zero. We will now implement it using the glmnet package.
1. Installing and loading the glmnet package
We first install and load the glmnet package which provides tools for regularized regression.
install.packages("glmnet"): Installs the package from CRAN.
library(glmnet): Loads the package into the R session so we can use its functions.
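The two calls described above can be sketched as follows. The requireNamespace() check is an addition made here so that the package is not reinstalled on every run; it is optional.

```r
# Install glmnet from CRAN only if it is not already available
if (!requireNamespace("glmnet", quietly = TRUE)) {
  install.packages("glmnet")
}

# Attach the package so glmnet() and related functions can be called directly
library(glmnet)
```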
2. Loading and preparing the data
We use the built-in mtcars dataset and split it into predictor and response variables.
data(mtcars): Loads the dataset into the environment.
X: A matrix of predictor variables (all columns except the first).
y: A response variable (miles per gallon or mpg, the first column).
as.matrix(): Converts the predictors to a matrix which is required by glmnet().
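A minimal sketch of this preparation step, using the variable names X and y from the description above:

```r
# Load the built-in mtcars dataset (32 cars, 11 variables)
data(mtcars)

# Predictors: every column except the first (mpg), converted to a numeric matrix
X <- as.matrix(mtcars[, -1])

# Response: miles per gallon
y <- mtcars$mpg

# Quick sanity check on the shapes before fitting
dim(X)     # rows and columns of the predictor matrix
length(y)  # number of observations in the response
```

The number of rows in X must match the length of y, since glmnet() pairs each row of predictors with one response value.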
3. Fitting the Lasso regression model
We now fit the Lasso model using the glmnet() function.
glmnet(): Fits a regularized linear model.
family = "gaussian": Specifies a continuous (Gaussian) response, which gives ordinary linear regression.
alpha = 1: Sets the model type to Lasso regression.
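summary(): Lists the components of the fitted glmnet object (their length, class and mode). To inspect the fit itself, print() shows the regularization path and coef() returns the coefficients at a chosen lambda.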
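The fitting step can be sketched as below, repeating the data preparation so the block runs on its own. The object name lasso_model and the value s = 0.5 are assumptions chosen for illustration; in practice lambda is usually selected by cross-validation, for example with cv.glmnet().

```r
library(glmnet)

# Prepare the data as in the previous step
data(mtcars)
X <- as.matrix(mtcars[, -1])
y <- mtcars$mpg

# Fit a Lasso path: alpha = 1 selects the pure L1 penalty,
# and lambda is left at its default so glmnet picks a sequence of values
lasso_model <- glmnet(X, y, family = "gaussian", alpha = 1)

# One row per lambda: nonzero coefficients (Df), % deviance explained, lambda
print(lasso_model)

# Coefficients at one illustrative penalty strength (s is the lambda value);
# entries shown as "." have been shrunk exactly to zero
coef(lasso_model, s = 0.5)
```

Because no single lambda is supplied, the fitted object holds a whole path of models, and s picks the point on that path to inspect.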