Logistic Regression is a supervised learning algorithm used for classification problems. To measure how well the model is performing, we use a cost function that evaluates how closely the predicted probabilities match the actual class labels. In Logistic Regression, the cost function is based on log loss (cross-entropy loss) instead of mean squared error.
- It measures the error between the predicted probability and the actual class label (0 or 1).
- Instead of fitting a straight line to the targets (as Linear Regression does), Logistic Regression passes a linear score through the sigmoid function to produce probabilities between 0 and 1.
- The cost function penalizes a prediction most heavily when the model is confident but wrong.
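The last point can be illustrated with a small sketch. It assumes the standard per-example log loss, which is -log(p) when the true label is 1 and -log(1 - p) when it is 0, where p is the predicted probability of the positive class. The helper name `penalty` and the probabilities tried are hypothetical, chosen only for illustration.

```python
import math

def penalty(p, y):
    """Log loss for one example: p is the predicted probability of class 1, y is the true label."""
    return -math.log(p) if y == 1 else -math.log(1 - p)

# True label is 1 in every case; only the predicted probability changes.
for p in (0.9, 0.6, 0.4, 0.1, 0.01):
    print(f"p = {p:<5} loss = {penalty(p, 1):.3f}")
```

The loss grows slowly while the prediction is roughly right (about 0.105 at p = 0.9) and rises steeply as the model becomes confidently wrong (about 4.605 at p = 0.01).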
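The cost function for a single training example is defined as:

$$
\text{Cost}(h_\theta(x), y) =
\begin{cases}
-\log(h_\theta(x)) & \text{if } y = 1 \\
-\log(1 - h_\theta(x)) & \text{if } y = 0
\end{cases}
$$

Where:
- $h_\theta(x)$: Predicted probability using sigmoid
- $y$: Actual value (0 or 1)
For all $m$ training examples, the cost function (log loss) becomes:

$$
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log\left(h_\theta(x^{(i)})\right) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]
$$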
Why Not Use Mean Squared Error (MSE)
- MSE works well for regression, but when it is combined with the sigmoid function the cost becomes non-convex in the model parameters, so Gradient Descent can stall in flat regions or local minima.
- Log loss gives a convex cost function, so any minimum that Gradient Descent reaches is a global minimum, which makes optimization easier.
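The convexity difference can be checked numerically on a toy case. The sketch below assumes a single training example with x = 1 and y = 1 and a single weight `w` (all hypothetical), then looks at second differences of each cost over a grid of weights. A convex curve never has a negative second difference.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy setup: one example with x = 1, y = 1, and one weight w.
w = np.linspace(-6.0, 6.0, 121)
p = sigmoid(w)  # predicted probability for each candidate weight

mse = (p - 1.0) ** 2   # squared error against y = 1
log_loss = -np.log(p)  # cross-entropy against y = 1

# Second differences approximate curvature along the weight axis.
mse_curv = np.diff(mse, n=2)
log_curv = np.diff(log_loss, n=2)

print("MSE has negative curvature somewhere:", bool((mse_curv < 0).any()))
print("Log loss has negative curvature somewhere:", bool((log_curv < 0).any()))
```

Under these assumptions the first line reports True and the second False: the squared-error curve bends downward for strongly negative weights, while the log loss curve bends upward everywhere on the grid.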
Implementing Logistic Regression Cost Function in Python
This example demonstrates how Logistic Regression computes predicted probabilities using the sigmoid function and evaluates model performance using the log loss (binary cross-entropy) cost function. It shows how prediction confidence affects the overall error, and it clips the probabilities before taking logarithms to stay numerically stable.
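A minimal sketch of such an implementation with NumPy follows. The feature values, labels, weight, bias, and clipping tolerance `eps` are illustrative assumptions, chosen so that the script reproduces the numbers listed under Output.

```python
import numpy as np

def sigmoid(z):
    """Map raw scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(y_true, y_pred, eps=1e-15):
    """Binary cross-entropy averaged over all examples."""
    # Clip so that log(0) never occurs; eps is an assumed tolerance.
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Assumed toy data and parameters (not taken from a trained model).
X = np.array([0.2, 0.4, 0.6])
y = np.array([0, 1, 1])
weight, bias = 0.5, 0.0

y_pred = sigmoid(weight * X + bias)  # predicted probability of class 1
cost = log_loss(y, y_pred)

print("Predicted Probabilities:", y_pred)
print("Cost Function Value:", cost)
```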
Output:
Predicted Probabilities: [0.52497919 0.549834 0.57444252]
Cost Function Value: 0.6322969246412298
- Predicted Probabilities: These are the model’s estimated chances of the positive class, all close to 0.5, indicating low confidence.
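- Cost Function Value: A log loss of about 0.63 is only slightly below ln 2 ≈ 0.693, the loss of a model that always predicts 0.5, so these predictions are barely better than an uninformed guess and need improvement.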