In this article, we will look at examples of non-linear regression in machine learning. These models are widely used in regression analysis because much real-world data follows complex, non-linear relationships between the dependent and independent variables.
Nonlinear regression refers to a broader category of regression models where the relationship between the dependent variable and the independent variables is not assumed to be linear. If the underlying pattern in the data exhibits a curve, whether it's exponential growth, decay, logarithmic, or any other non-linear form, fitting a nonlinear regression model can provide a more accurate representation of the relationship. Linear regression, by contrast, assumes in advance that the relationship is linear, so it cannot capture such curvature.
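A nonlinear regression model can be expressed as:

$$Y = f(X, \beta) + \epsilon$$

Where,
- $Y$: the dependent variable
- $X$: the independent variables
- $\beta$: the vector of model parameters (also written $\theta$)
- $f$: a function that is nonlinear in the parameters
- $\epsilon$: the random error term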
Many different regression models exist and can be chosen to match the shape of the dataset, such as quadratic, cubic, or higher-degree polynomial regression, as required.
The assumptions of nonlinear regression are similar to those in linear regression but may have nuanced interpretations due to the nonlinearity of the model.
There are two main types of non-linear regression in machine learning:
Nonlinear regression encompasses various types of models that capture relationships between variables in a nonlinear manner. Here are some common types:
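Polynomial regression is a type of nonlinear regression that fits a polynomial function to the data. The fitted curve is nonlinear in $x$, although the model is still linear in its coefficients. The general form of a polynomial regression model is:

$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \dots + \beta_n x^n + \epsilon$$

where,
- $y$: dependent variable
- $x$: independent variable
- $\beta_0, \beta_1, \dots, \beta_n$: coefficients of the polynomial
- $n$: degree of the polynomial
- $\epsilon$: error term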
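As a minimal sketch of polynomial regression, the snippet below fits a degree-2 polynomial with `numpy.polyfit`. The data are synthetic and the true coefficients (1, 2, -0.5) are assumptions chosen only for illustration.

```python
import numpy as np

# Synthetic data (an assumption for illustration):
# y = 1 + 2x - 0.5x^2 plus a little Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 50)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=0.1, size=x.size)

# Fit a degree-2 polynomial by least squares.
# np.polyfit returns coefficients from the highest power down to the constant.
b2, b1, b0 = np.polyfit(x, y, deg=2)

# Evaluate the fitted polynomial at the original x values.
y_hat = np.polyval([b2, b1, b0], x)

print(f"b0={b0:.2f}, b1={b1:.2f}, b2={b2:.2f}")
```

Because the model is linear in its coefficients, no iterative optimizer or starting guess is needed here.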
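Exponential regression is a type of nonlinear regression that fits an exponential function to the data. The general form of an exponential regression model is:

$$y = \beta_0 e^{\beta_1 x} + \epsilon$$

where,
- $y$: dependent variable
- $x$: independent variable
- $\beta_0, \beta_1$: parameters of the model
- $\epsilon$: error term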
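A minimal sketch of exponential regression, assuming the model y = b0 * exp(b1 * x) and using `scipy.optimize.curve_fit`. The synthetic data, the true parameter values, and the starting guess `p0` are all assumptions for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def exponential(x, b0, b1):
    """Exponential model y = b0 * exp(b1 * x)."""
    return b0 * np.exp(b1 * x)

# Synthetic data (assumed true parameters: b0 = 2.0, b1 = 0.8).
rng = np.random.default_rng(1)
x = np.linspace(0, 4, 40)
y = exponential(x, 2.0, 0.8) + rng.normal(scale=0.2, size=x.size)

# Nonlinear least squares; p0 is a rough starting guess.
popt, pcov = curve_fit(exponential, x, y, p0=(1.0, 0.5))
b0_hat, b1_hat = popt

print(f"b0={b0_hat:.2f}, b1={b1_hat:.2f}")
```

Unlike the polynomial case, this model is nonlinear in `b1`, so the fit is iterative and can depend on the starting guess.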
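Logarithmic regression is a type of nonlinear regression that fits a logarithmic function to the data. The general form of a logarithmic regression model is:

$$y = \beta_0 + \beta_1 \ln(x) + \epsilon$$

where,
- $y$: dependent variable
- $x$: independent variable ($x > 0$)
- $\beta_0, \beta_1$: parameters of the model
- $\epsilon$: error term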
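Power regression is a type of nonlinear regression that fits a power function to the data. The general form of a power regression model is:

$$y = \beta_0 x^{\beta_1} + \epsilon$$

where,
- $y$: dependent variable
- $x$: independent variable
- $\beta_0, \beta_1$: parameters of the model
- $\epsilon$: error term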
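One common way to fit a power model y = b0 * x^b1 is to take logarithms of both sides, which turns it into a straight line: ln(y) = ln(b0) + b1 * ln(x). The sketch below assumes strictly positive data and multiplicative noise, so it is not identical to a direct least squares fit with additive errors. The data and true parameters are made up for illustration.

```python
import numpy as np

# Synthetic data (assumed true parameters: b0 = 3.0, b1 = 1.5)
# with multiplicative noise, so the log transform is appropriate.
rng = np.random.default_rng(2)
x = np.linspace(1, 10, 30)
y = 3.0 * x**1.5 * np.exp(rng.normal(scale=0.05, size=x.size))

# Fit a straight line in log-log space: ln(y) = intercept + slope * ln(x).
slope, intercept = np.polyfit(np.log(x), np.log(y), deg=1)

# Transform back to the parameters of the power model.
b0_hat = np.exp(intercept)
b1_hat = slope

print(f"b0={b0_hat:.2f}, b1={b1_hat:.2f}")
```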
Generalized additive models (GAMs) are a type of nonlinear regression that models the response as a sum of smooth functions of the individual predictors, which lets them capture complex relationships between variables. The general form of a GAM is:

$$y = \beta_0 + f_1(x_1) + f_2(x_2) + \dots + f_n(x_n) + \epsilon$$

where,
- $y$: dependent variable
- $\beta_0$: intercept
- $f_1(x_1), f_2(x_2), \dots, f_n(x_n)$: smooth functions of the independent variables
- $\epsilon$: error term

The Gauss-Newton algorithm is an iterative optimization method designed for minimizing the sum of squared differences between observed and predicted values in nonlinear least squares regression. At each iteration it linearizes the model around the current parameter estimates using the Jacobian matrix, then solves the resulting linear least squares problem on the residuals to obtain the parameter update.
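The Gauss-Newton update can be sketched in a few lines of NumPy. This is an illustrative implementation under stated assumptions, not a production solver: the model (y = b0 * exp(b1 * x)), the noise-free synthetic data, and the starting point are all chosen for the example, and there is no safeguard against divergence from a poor starting point.

```python
import numpy as np

def model(x, b):
    """Assumed example model: y = b[0] * exp(b[1] * x)."""
    return b[0] * np.exp(b[1] * x)

def jacobian(x, b):
    """Partial derivatives of the model with respect to b[0] and b[1]."""
    e = np.exp(b[1] * x)
    return np.column_stack([e, b[0] * x * e])

def gauss_newton(x, y, b_init, n_iter=20, tol=1e-10):
    b = np.asarray(b_init, dtype=float)
    for _ in range(n_iter):
        r = y - model(x, b)          # current residuals
        J = jacobian(x, b)           # linearization of the model around b
        # Solve the linear least squares problem J @ step ~= r.
        step, *_ = np.linalg.lstsq(J, r, rcond=None)
        b = b + step
        if np.linalg.norm(step) < tol:
            break
    return b

# Noise-free synthetic data with assumed true parameters (2.0, 1.0).
x = np.linspace(0, 2, 25)
y = model(x, np.array([2.0, 1.0]))

b_hat = gauss_newton(x, y, b_init=[1.5, 0.8])
print(b_hat)
```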
The Gradient Descent algorithm is a widely used iterative optimization technique for finding the minimum of a function. In the context of nonlinear regression, it updates parameter estimates by iteratively moving towards the direction of the steepest decrease in the objective function, with the learning rate controlling the step size.
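As a rough sketch of gradient descent for nonlinear regression, the loop below minimizes the mean squared error of an assumed exponential decay model y = b0 * exp(-b1 * x). The model, the noise-free data, the learning rate, and the iteration count are illustrative assumptions; in practice the learning rate usually needs tuning.

```python
import numpy as np

def model(x, b):
    """Assumed example model: exponential decay b[0] * exp(-b[1] * x)."""
    return b[0] * np.exp(-b[1] * x)

def mse_gradient(x, y, b):
    """Gradient of mean((model - y)^2) with respect to b[0] and b[1]."""
    e = np.exp(-b[1] * x)
    r = b[0] * e - y                          # prediction errors
    grad_b0 = 2.0 * np.mean(r * e)
    grad_b1 = 2.0 * np.mean(r * (-b[0] * x * e))
    return np.array([grad_b0, grad_b1])

# Noise-free synthetic data with assumed true parameters (2.0, 1.5).
x = np.linspace(0.0, 2.0, 50)
y = model(x, np.array([2.0, 1.5]))

b = np.array([1.0, 1.0])   # starting guess
learning_rate = 0.2        # step size (assumed; problem dependent)
for _ in range(10000):
    # Move against the gradient, i.e. in the direction of steepest decrease.
    b = b - learning_rate * mse_gradient(x, y, b)

print(b)
```

Gradient descent needs many more iterations here than a Gauss-Newton style method would, which is typical for small least squares problems.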
The Levenberg-Marquardt algorithm is a modification of the Gauss-Newton algorithm that introduces a damping parameter to enhance robustness. It dynamically adjusts the step size during iterations by combining the advantages of Gauss-Newton and gradient descent methods, providing a versatile approach for solving nonlinear least squares problems.
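In SciPy, a Levenberg-Marquardt solver is available through `scipy.optimize.least_squares` with `method="lm"`. The sketch below applies it to an assumed exponential decay model; the data, true parameters, and starting point are made up for illustration.

```python
import numpy as np
from scipy.optimize import least_squares

def model(x, b0, b1):
    """Assumed example model: exponential decay b0 * exp(-b1 * x)."""
    return b0 * np.exp(-b1 * x)

def residuals(params, x, y):
    """Residual vector whose sum of squares is minimized."""
    return model(x, *params) - y

# Noise-free synthetic data with assumed true parameters (2.0, 1.5).
x = np.linspace(0.0, 2.0, 50)
y = model(x, 2.0, 1.5)

# method="lm" selects the Levenberg-Marquardt implementation.
result = least_squares(residuals, x0=[1.0, 1.0], args=(x, y), method="lm")

print(result.x)
```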
Evaluating the performance of a nonlinear regression model is crucial to ensure it accurately represents the underlying relationship between the independent and dependent variables.
There are a number of different metrics that can be used to evaluate non-linear regression models, but some common metrics are:
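The three metrics reported later in this article (mean absolute error, mean squared error, and the R2 score) can be computed directly with NumPy. The helper names and the tiny example arrays below are assumptions for illustration.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between actual and predicted values."""
    return np.mean(np.abs(y_true - y_pred))

def mean_squared_error(y_true, y_pred):
    """Average squared difference between actual and predicted values."""
    return np.mean((y_true - y_pred) ** 2)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

# Small hand-made example.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])

print(f"MAE: {mean_absolute_error(y_true, y_pred):.3f}")  # 0.150
print(f"MSE: {mean_squared_error(y_true, y_pred):.3f}")   # 0.025
print(f"R2:  {r2_score(y_true, y_pred):.3f}")             # 0.980
```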
Non-linear regression algorithms work by iteratively adjusting the parameters of a non-linear function to minimize the error between the predicted values of the dependent variable and the actual values. The specific function used depends on the nature of the relationship between the variables, and there are many different types of non-linear functions that can be used.
Here we are implementing Non-Linear Regression using Python:
Importing all the necessary libraries:
Importing and reading the dataset: Dataset Link
Output:
Year Value
0 1960 5.918412e+10
1 1961 4.955705e+10
2 1962 4.668518e+10
3 1963 5.009730e+10
4 1964 5.906225e+10
Creates a scatter plot of the Year (independent variable) vs. Value (dependent variable).
Output:
[Figure: scatter plot of Year against Value]
Representing a simple logistic model curve over a range of independent variable values.
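The code for this step did not survive in the text, so the following is only a sketch of how such a curve can be computed. The range (-5 to 5) and step size are assumptions; plotting `Y` against `X` gives the familiar S-shaped curve.

```python
import numpy as np

# Independent variable values (range and step are assumed for illustration).
X = np.arange(-5.0, 5.0, 0.1)

# Standard logistic function: values rise smoothly from near 0 to near 1.
Y = 1.0 / (1.0 + np.exp(-X))
```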
Output:
[Figure: logistic (sigmoid) curve]
- Defines a sigmoid function with parameters Beta_1 (slope) and Beta_2 (intercept).
- Sets initial values for Beta_1 and Beta_2 based on intuition or estimation.
- Plots the predicted values (scaled up by 15000000000000 for better visibility) compared to the actual data points.
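A sketch of what this step might look like follows. The exact sigmoid form, the initial values of `beta_1` and `beta_2`, and the year range are assumptions (the form is chosen to be consistent with the fitted values reported later); only the scale factor 15000000000000 comes from the text.

```python
import numpy as np

def sigmoid(x, Beta_1, Beta_2):
    """Sigmoid with Beta_1 controlling steepness and Beta_2 the midpoint (assumed form)."""
    return 1.0 / (1.0 + np.exp(-Beta_1 * (x - Beta_2)))

# Hypothetical initial guesses, chosen by eye.
beta_1 = 0.10
beta_2 = 1990.0

# Hypothetical range of years; the dataset shown above starts at 1960.
x_data = np.arange(1960, 2015)

# Predicted curve, scaled up so it is visible next to the raw values.
Y_pred = sigmoid(x_data, beta_1, beta_2) * 15000000000000.0
```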
Output:
[Figure: scaled sigmoid predictions plotted against the actual data points]
- Normalizes Year and Value by their respective maximum values to scale them between 0 and 1.
- Uses the curve_fit function from scipy.optimize to find the best-fitting parameters for the sigmoid function based on the normalized data.
- Receives the optimized parameters (popt) and their covariance matrix (pcov).
- Prints the final values of Beta_1 and Beta_2 found by the curve fitting.

Output:
Beta_1 = 690.451709, Beta_2 = 0.997207
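The normalization and fitting steps can be sketched as follows. Because the dataset link is not available here, the sketch uses synthetic data generated from a sigmoid, so the printed values will differ from the output above. The sigmoid form, the synthetic parameters, and the starting guess `p0` are assumptions.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, Beta_1, Beta_2):
    """Assumed sigmoid form: 1 / (1 + exp(-Beta_1 * (x - Beta_2)))."""
    return 1.0 / (1.0 + np.exp(-Beta_1 * (x - Beta_2)))

# Synthetic stand-in for the Year and Value columns (not the real dataset).
rng = np.random.default_rng(0)
x_raw = np.arange(0.0, 51.0)
y_raw = 2.0e12 * sigmoid(x_raw, 0.24, 25.0) + rng.normal(scale=2.0e10, size=x_raw.size)

# Normalize both variables by their maximum values.
xdata = x_raw / x_raw.max()
ydata = y_raw / y_raw.max()

# Fit the sigmoid to the normalized data; p0 is a rough starting guess.
popt, pcov = curve_fit(sigmoid, xdata, ydata, p0=(5.0, 0.4))

print("Beta_1 = %f, Beta_2 = %f" % (popt[0], popt[1]))
```

On the normalized scale the synthetic curve corresponds to Beta_1 = 12 and Beta_2 = 0.5, so the fitted values should land close to those.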
- Generates Year values within the original data range for plotting the fitted curve.
- Evaluates the sigmoid function over this Year range using the optimized parameters from the curve-fitting step.

Output:
[Figure: fitted sigmoid curve plotted over the data]
- Uses the curve_fit function again to find the best-fitting parameters for the sigmoid function using only the training data.

Output:
Mean absolute error: 0.05
Residual sum of squares (MSE): 0.00
R2-score: 0.88
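A sketch of this evaluation step on synthetic data is shown below. The data, the sigmoid form, the starting guess, and the split (every fifth point held out for testing) are assumptions, so the numbers it prints are not the ones reported above.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, Beta_1, Beta_2):
    """Assumed sigmoid form: 1 / (1 + exp(-Beta_1 * (x - Beta_2)))."""
    return 1.0 / (1.0 + np.exp(-Beta_1 * (x - Beta_2)))

# Synthetic, already-normalized data (not the real dataset).
rng = np.random.default_rng(42)
xdata = np.linspace(0.0, 1.0, 60)
ydata = sigmoid(xdata, 12.0, 0.5) + rng.normal(scale=0.02, size=xdata.size)

# Hold out every fifth point for testing; train on the rest.
test_mask = np.arange(xdata.size) % 5 == 0
train_x, train_y = xdata[~test_mask], ydata[~test_mask]
test_x, test_y = xdata[test_mask], ydata[test_mask]

# Fit on the training data only, then predict on the held-out points.
popt, pcov = curve_fit(sigmoid, train_x, train_y, p0=(5.0, 0.4))
y_hat = sigmoid(test_x, *popt)

mae = np.mean(np.abs(y_hat - test_y))
mse = np.mean((y_hat - test_y) ** 2)
r2 = 1.0 - np.sum((test_y - y_hat) ** 2) / np.sum((test_y - np.mean(test_y)) ** 2)

print("Mean absolute error: %.2f" % mae)
print("Mean squared error: %.2f" % mse)
print("R2-score: %.2f" % r2)
```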
For a model to be considered non-linear, Y hat must be a non-linear function of the parameters Theta, not necessarily of the features X. A non-linear equation can take the form of an exponential, logarithmic, logistic, or many other types of function.
In all of these forms, the change in Y hat depends on changes in the parameters Theta, not only on X. That is, in non-linear regression, a model is non-linear in its parameters.
| Feature | Linear Regression | Non-Linear Regression |
|---|---|---|
| Relationship between variables | Assumes a linear relationship between the independent and dependent variables | Allows for non-linear relationships between the independent and dependent variables |
| Model complexity | Simpler model with fewer parameters | More complex model with more parameters |
| Interpretability | Highly interpretable due to the linear relationship | Less interpretable due to the non-linear relationship |
| Overfitting susceptibility | Less susceptible to overfitting due to its simplicity, but may underfit curved data | More susceptible to overfitting because it can bend to fit noise as well as the underlying pattern |
| Flexibility | Less flexible; its few parameters can often be estimated from relatively small datasets | More flexible; often needs more data to estimate its parameters reliably |
| Applications | Suitable for predicting continuous target variables when the relationship is linear | Suitable for predicting continuous target variables when the relationship is non-linear |
| Examples | Predicting house prices based on size and location | Modeling population growth or decay over time |
Much real-world data is non-linear, and in those cases non-linear regression techniques can fit the data better than linear regression techniques. Non-linear regression helps produce a robust model whose predictions are reliable and follow the trend in the historical data. Typical tasks include modeling exponential growth or decay of a population, financial forecasting, and logistic pricing models.
Non-linear regression using Python is a powerful tool for modeling relationships that are not linear in nature. It is used in a wide variety of fields, including economics, finance, medicine, science, and engineering.
It can be a complex topic, but it is worth learning if your data calls for it.