Ridge Regressor using sklearn

Last Updated : 24 Mar, 2026

Ridge Regression is a machine learning technique that reduces overfitting by adding a regularization term to the linear regression model. Scikit-Learn provides a ready-made implementation, which this article uses.

Helps prevent overfitting by discouraging the model from fitting noise instead of the actual trend.
It is a regularised version of linear regression.
It helps handle highly correlated features by shrinking coefficients, resulting in more stable and accurate predictions.
Introduces the alpha parameter to control the strength of regularisation.
Penalises large coefficients to improve model generalisation.

Ridge Regression Loss Function

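Ridge Regression aims to minimise both the prediction error and the size of the model coefficients, which is expressed by its cost function:

$$J(w) = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 + \alpha \sum_{j=1}^{p} w_j^2$$

where:

$y_i$: The actual value of the target variable for the $i^{th}$ data point.
$\hat{y}_i$: Predicted value of the target variable.
$\alpha$: Regularisation parameter that controls the strength of the penalty on large weights.
$J(w)$: The overall cost (loss) that Ridge Regression tries to minimise.

Here $n$ is the number of data points and $w_1, \dots, w_p$ are the model weights.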

The first term represents the standard linear regression cost, measuring the mean squared error between predicted and actual values. The second term is the L2 regularization term, which penalises large coefficients to improve generalisation and prevent overfitting.
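The two terms can be computed directly. The following is a minimal sketch, assuming the cost is the mean squared error plus alpha times the sum of squared weights, as described above; the function name `ridge_cost` and the toy arrays are illustrative and not part of the original article.

```python
import numpy as np


def ridge_cost(X, y, w, b, alpha):
    """Mean squared error plus alpha times the sum of squared weights."""
    y_pred = X @ w + b                    # linear model predictions
    mse = np.mean((y - y_pred) ** 2)      # first term: prediction error
    l2_penalty = alpha * np.sum(w ** 2)   # second term: L2 penalty (intercept b is not penalised)
    return mse + l2_penalty


# Tiny hand-checkable example (made-up numbers)
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
w = np.array([1.5])

print(ridge_cost(X, y, w, b=0.0, alpha=0.0))  # error term only
print(ridge_cost(X, y, w, b=0.0, alpha=1.0))  # error term plus penalty
```

Note that scikit-learn's `Ridge` minimises the sum of squared errors rather than the mean, so the same alpha value corresponds to a different penalty strength there.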

How Alpha Controls Regularisation

If $\alpha = 0$, Ridge Regression reduces to ordinary linear regression.
A larger $\alpha$ increases regularisation strength, shrinking coefficients toward zero.
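Both statements can be checked numerically. The sketch below is an illustration rather than the article's code: it assumes a model without an intercept and a cost of mean squared error plus alpha times the sum of squared weights, for which the minimiser has a closed form. The helper `ridge_closed_form` and the random data are hypothetical.

```python
import numpy as np


def ridge_closed_form(X, y, alpha):
    """Minimise mean((y - X @ w)**2) + alpha * sum(w**2), with no intercept."""
    n, p = X.shape
    # Setting the gradient to zero gives (X^T X + n * alpha * I) w = X^T y
    return np.linalg.solve(X.T @ X + n * alpha * np.eye(p), X.T @ y)


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([3.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

ols = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary least squares solution
alphas = [0.0, 0.1, 1.0, 10.0]
norms = []
for alpha in alphas:
    w = ridge_closed_form(X, y, alpha)
    norms.append(np.linalg.norm(w))
    print(f"alpha={alpha:<5} weights={np.round(w, 3)} norm={norms[-1]:.3f}")
```

With alpha set to 0 the weights coincide with the least squares solution, and the weight norm falls as alpha grows.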

Ridge Regression is sensitive to the scale of features, so input features should be standardized.

Choosing the Right Alpha

Selecting an appropriate $\alpha$ is important for balancing bias and variance. Common approaches include:

Cross-Validation: Test different values and choose the one with the best performance on unseen data.
Grid Search: Try a predefined set of values and select the one that minimizes prediction error.
Error Metrics: Compare candidate values using error measured on held-out validation data rather than on the training data, since training error always favours weaker regularisation.
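The approaches above can be combined in a cross-validated grid search. This is a hedged sketch, not the article's own code: the synthetic dataset from `make_regression` stands in for real data so the example runs offline, and the alpha grid is an arbitrary illustrative choice.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption, used so the sketch runs offline)
X, y = make_regression(n_samples=300, n_features=10, noise=15.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scaling sits inside the pipeline so each CV fold is scaled from its own training part
pipeline = make_pipeline(StandardScaler(), Ridge())
param_grid = {"ridge__alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}  # illustrative grid

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="neg_mean_squared_error")
search.fit(X_train, y_train)

best_alpha = search.best_params_["ridge__alpha"]
# The test set is used once, after alpha has been chosen by cross-validation
test_r2 = search.best_estimator_.score(X_test, y_test)
print("Best alpha:", best_alpha)
print("Test R^2:", test_r2)
```

Keeping the test set out of the search means the final score is measured on data that played no part in choosing alpha.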

Step By Step Implementation

Here we implement Ridge Regression on the California housing dataset.

Step 1: Import Required Libraries

Import the essential libraries:

NumPy for numerical operations
Matplotlib for visualization
Scikit-learn modules for dataset handling, preprocessing, regression modeling and evaluation
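A plausible set of imports for the steps that follow is sketched below. The exact import list of the original article is not available, so this selection is an assumption based on the tools named in the later steps.

```python
import numpy as np
import matplotlib.pyplot as plt

from sklearn.datasets import fetch_california_housing   # dataset loading
from sklearn.linear_model import RidgeCV                # ridge with built-in cross-validation
from sklearn.metrics import r2_score                    # evaluation metric
from sklearn.model_selection import train_test_split    # train/test splitting
from sklearn.preprocessing import StandardScaler        # feature scaling
```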

Step 2: Load and Split Dataset

Fetch the California housing dataset, assign features (X) and target (y) and split into training and test sets for model training and evaluation.
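A minimal sketch of this step follows. The 80/20 split and `random_state=42` are assumptions, since the article does not state which values it used, and the helper names are hypothetical. The splitting logic is exercised on placeholder arrays so the block runs without downloading anything; `load_california_split` downloads the dataset the first time it is called.

```python
import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split


def split_features_target(X, y, test_size=0.2, random_state=42):
    """Hold out a fraction of the rows as a test set (split values are assumptions)."""
    return train_test_split(X, y, test_size=test_size, random_state=random_state)


def load_california_split():
    """Fetch the California housing data (downloaded on first use) and split it."""
    X, y = fetch_california_housing(return_X_y=True)
    return split_features_target(X, y)


# Offline check of the splitting logic on placeholder arrays
X_demo = np.arange(200.0).reshape(100, 2)
y_demo = np.arange(100.0)
X_train, X_test, y_train, y_test = split_features_target(X_demo, y_demo)
print(X_train.shape, X_test.shape)
```

In the article's setting, `X_train, X_test, y_train, y_test = load_california_split()` would supply the real data.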

Step 3: Feature Scaling

Standardize the features using StandardScaler, fitting it on the training set and applying the same transformation to the test set. This puts all variables on the same scale, so the L2 penalty treats every coefficient comparably.
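The scaling step might look like the sketch below. The helper `scale_features` and the random placeholder data are illustrative assumptions; the point shown is that the scaler learns its mean and standard deviation from the training set only.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler


def scale_features(X_train, X_test):
    """Fit the scaler on the training set only, then transform both sets."""
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)  # reuses the training mean and std
    return X_train_scaled, X_test_scaled, scaler


# Placeholder features on very different scales
rng = np.random.default_rng(0)
X_train = rng.normal(loc=[10.0, 1000.0], scale=[2.0, 300.0], size=(80, 2))
X_test = rng.normal(loc=[10.0, 1000.0], scale=[2.0, 300.0], size=(20, 2))

X_train_scaled, X_test_scaled, scaler = scale_features(X_train, X_test)
print(X_train_scaled.mean(axis=0).round(3), X_train_scaled.std(axis=0).round(3))
```

Fitting the scaler on the test set as well would leak information about the test data into the model.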

Step 4: Train Ridge Regression with Cross-Validation

Initialize RidgeCV with multiple alpha values and 5-fold cross-validation, then fit it on the scaled training data to find the optimal regularization strength.
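A hedged sketch of this step is shown below. The candidate alphas in `ALPHAS` are illustrative, since the article's exact list is not shown, and synthetic data from `make_regression` stands in for the scaled housing features so the block runs offline.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.preprocessing import StandardScaler

ALPHAS = (0.1, 1.0, 10.0, 100.0)  # illustrative candidates, not the article's list


def fit_ridge_cv(X_train_scaled, y_train, alphas=ALPHAS):
    """Fit RidgeCV with 5-fold cross-validation over the candidate alphas."""
    model = RidgeCV(alphas=alphas, cv=5)
    model.fit(X_train_scaled, y_train)
    return model


# Synthetic stand-in for the scaled training data
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

ridge_cv = fit_ridge_cv(X_scaled, y)
print("Best alpha selected:", ridge_cv.alpha_)
```

After fitting, the chosen regularization strength is available as the `alpha_` attribute.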

Output:

[Figure: Model Trained]

Step 5: Make Predictions and Evaluate Model

Predict housing values on the test set and evaluate model performance using the R² score.
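The prediction and scoring step can be sketched as follows. The helper `evaluate` is hypothetical, and the block trains on synthetic data so that it is self-contained, which means its numbers will not match the article's reported output.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split


def evaluate(model, X_test, y_test):
    """Return test-set predictions and their R^2 score."""
    y_pred = model.predict(X_test)
    return y_pred, r2_score(y_test, y_pred)


# Synthetic stand-in data (an assumption, used so the sketch runs offline)
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = RidgeCV(alphas=(0.1, 1.0, 10.0), cv=5).fit(X_train, y_train)

y_pred, r2 = evaluate(model, X_test, y_test)
print("Best alpha selected:", model.alpha_)
print("Model score (R^2):", r2)
```

The same value is returned by `model.score(X_test, y_test)`, which computes R² for scikit-learn regressors.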

Output:

Best alpha selected: 10.0
Model score (R^2): 0.595944060491304

Step 6: Visualize Predictions

Plot the predicted vs actual housing values with a best-fit line to visually assess how well the Ridge Regression model fits the data.
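One way to draw such a plot is sketched below. The function `plot_predictions`, the labels and the use of `np.polyfit` for the best-fit line are assumptions rather than the article's code, and the arrays are random placeholders for the real test targets and predictions.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_predictions(y_true, y_pred):
    """Scatter predicted against actual values and overlay a least-squares line."""
    fig, ax = plt.subplots(figsize=(7, 5))
    ax.scatter(y_true, y_pred, alpha=0.4, label="Predictions")

    # Straight line fitted through the (actual, predicted) points
    slope, intercept = np.polyfit(y_true, y_pred, deg=1)
    xs = np.linspace(np.min(y_true), np.max(y_true), 100)
    ax.plot(xs, slope * xs + intercept, color="red", label="Best-fit line")

    ax.set_xlabel("Actual values")
    ax.set_ylabel("Predicted values")
    ax.set_title("Ridge Regression: predicted vs actual")
    ax.legend()
    return fig, ax


# Placeholder data standing in for y_test and the model's predictions
rng = np.random.default_rng(0)
y_true = rng.uniform(0.5, 5.0, size=200)
y_pred = y_true + rng.normal(scale=0.5, size=200)

fig, ax = plot_predictions(y_true, y_pred)
```

Calling `plt.show()` afterwards displays the figure. Points close to the diagonal indicate predictions close to the actual values.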

Output:

[Figure: Best fit line]


Limitations

It retains all features in the model, so it doesn’t help identify which predictors are truly important. This can make the model harder to interpret in datasets with many features.
A very high regularization parameter (alpha) can oversimplify the model, causing underfitting and missing important patterns in the data.
If you expect many coefficients to be exactly zero (sparse solution), Ridge regression is not ideal, unlike Lasso regression which can perform feature selection.
Although regularization reduces variance, Ridge regression can still be influenced by extreme outliers, which can affect the model’s predictions and coefficients.
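The difference in sparsity between Ridge and Lasso can be seen on a small example. This is an illustrative sketch on synthetic data in which only 5 of 20 features are informative; the alpha value of 5.0 is an arbitrary choice, not a recommendation.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: 20 features, of which only 5 influence the target
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=5, noise=10.0, random_state=0
)

ridge = Ridge(alpha=5.0).fit(X, y)
lasso = Lasso(alpha=5.0).fit(X, y)

# Count coefficients that are exactly zero in each model
ridge_zeros = int(np.sum(ridge.coef_ == 0))
lasso_zeros = int(np.sum(lasso.coef_ == 0))
print("Coefficients exactly zero with Ridge:", ridge_zeros)
print("Coefficients exactly zero with Lasso:", lasso_zeros)
```

Ridge shrinks the uninformative coefficients toward zero without removing them, whereas Lasso sets some of them exactly to zero.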



URL: https://www.geeksforgeeks.org/machine-learning/ml-ridge-regressor-using-sklearn/