Essential Metrics for Model Assessment: TP, TN, FP, FN in Machine Learning

Last Updated : 23 Jul, 2025

Evaluating model performance is an important step in both machine learning and data science. For classification models, four basic counts, True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN), quantify how well the model's predictions match the true labels. The popular Python library scikit-learn provides functions to obtain these counts and the metrics derived from them. These metrics help data scientists understand how accurately their models predict and tune them for specific decisions, which improves decision-making across diverse fields.

Definitions of Metrics

Understanding True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN) is crucial for evaluating the performance of classification models. These four counts are the cells of the confusion matrix and show which kinds of correct and incorrect predictions a model makes.

True Positive (TP):

Definition: The number of instances where the model correctly predicted the positive class.
Example: In a medical test scenario, if the model correctly identifies patients with a disease, these instances are considered true positives.

True Negative (TN):

Definition: The number of instances where the model correctly predicted the negative class.
Example: If the model correctly identifies healthy patients who do not have the disease, these instances are true negatives.

False Positive (FP):

Definition: The number of instances where the model predicted the positive class but the actual class is negative.
Example: If the model incorrectly identifies healthy patients as having the disease, these instances are false positives. This is also known as a Type I error.

False Negative (FN):

Definition: The number of instances where the model predicted the negative class but the actual class is positive.
Example: If the model fails to identify patients with the disease and classifies them as healthy, these instances are false negatives. This is also known as a Type II error.
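
The four definitions above can be checked by hand before any library is involved. The following is a minimal sketch in plain Python; the label lists `y_true` and `y_pred` are assumed purely for illustration (1 = has the disease, 0 = healthy) and do not come from a real dataset.

```python
# Illustrative labels, assumed for this sketch: 1 = has the disease, 0 = healthy
y_true = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]  # actual classes
y_pred = [0, 1, 0, 1, 0, 1, 1, 0, 0, 1]  # classes predicted by a hypothetical model

pairs = list(zip(y_true, y_pred))

# Each count follows directly from its definition
tp = sum(1 for actual, predicted in pairs if actual == 1 and predicted == 1)
tn = sum(1 for actual, predicted in pairs if actual == 0 and predicted == 0)
fp = sum(1 for actual, predicted in pairs if actual == 0 and predicted == 1)
fn = sum(1 for actual, predicted in pairs if actual == 1 and predicted == 0)

print("TP:", tp, "TN:", tn, "FP:", fp, "FN:", fn)
```

Every instance falls into exactly one of the four groups, so the counts always add up to the number of samples.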

Confusion Matrix

A confusion matrix is a table used to describe the performance of a classification model on a set of test data for which the true values are known. It allows visualization of the performance of an algorithm.

Here is the structure of a confusion matrix:

	Predicted Negative (0)	Predicted Positive (1)
Actual Negative (0)	True Negative (TN)	False Positive (FP)
Actual Positive (1)	False Negative (FN)	True Positive (TP)

Diagonal elements (TN and TP): Indicate the number of correct predictions.
Off-diagonal elements (FP and FN): Indicate the number of incorrect predictions.
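
To make the diagonal and off-diagonal reading concrete, here is a small sketch that stores a matrix in the layout shown above as a nested list and sums each group. The cell values are illustrative assumptions, not results from a real model.

```python
# Layout assumed from the table above: [[TN, FP], [FN, TP]]
matrix = [[4, 1],
          [1, 4]]

# Diagonal cells hold the correct predictions
correct = matrix[0][0] + matrix[1][1]

# Off-diagonal cells hold the incorrect predictions
incorrect = matrix[0][1] + matrix[1][0]

print("Correct predictions:", correct)
print("Incorrect predictions:", incorrect)
```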

Methods for Obtaining Metrics in Scikit-learn

Scikit-learn provides a variety of methods to compute the metrics essential for evaluating classification models. These methods revolve around the confusion_matrix function, which is the cornerstone for deriving True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN). Below are the methods and relevant functions for obtaining these metrics:

Confusion Matrix

The confusion_matrix function computes the confusion matrix from the true labels and the predicted labels.
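
The listing for this step is not present in the text, so the following is a minimal sketch rather than the original code. It assumes scikit-learn is installed, and the label lists are illustrative values chosen to be consistent with the output shown below.

```python
from sklearn.metrics import confusion_matrix

# Illustrative labels, assumed for this sketch
y_true = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]  # actual classes
y_pred = [0, 1, 0, 1, 0, 1, 1, 0, 0, 1]  # predicted classes

# Rows are actual classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
print(cm)
```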

Output:

[[4 1]
 [1 4]]

Extracting TP, TN, FP, and FN

After computing the confusion matrix, you can extract TP, TN, FP, and FN by using array indexing or the ravel method.
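
A minimal sketch of the ravel approach, using the same assumed labels as an illustration. For binary labels 0 and 1, flattening the 2x2 matrix row by row yields the values in the order TN, FP, FN, TP.

```python
from sklearn.metrics import confusion_matrix

# Illustrative labels, assumed for this sketch
y_true = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0, 0, 1]

# ravel() flattens [[TN, FP], [FN, TP]] into a 1-D array
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print(f"True Positives (TP): {tp}")
print(f"True Negatives (TN): {tn}")
print(f"False Positives (FP): {fp}")
print(f"False Negatives (FN): {fn}")
```

The same values are available by indexing: `cm[0, 0]` is TN, `cm[0, 1]` is FP, `cm[1, 0]` is FN and `cm[1, 1]` is TP.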

Output:

True Positives (TP): 4
True Negatives (TN): 4
False Positives (FP): 1
False Negatives (FN): 1

Direct Computation of Derived Metrics

Scikit-learn also provides functions to compute derived metrics such as precision, recall, and F1 score, which can be useful for evaluating model performance.

Precision: The ratio of correctly predicted positive observations to the total predicted positives, that is, TP / (TP + FP).
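
A minimal sketch using scikit-learn's `precision_score`, with the same assumed illustrative labels:

```python
from sklearn.metrics import precision_score

# Illustrative labels, assumed for this sketch
y_true = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0, 0, 1]

# Precision treats label 1 as the positive class by default
precision = precision_score(y_true, y_pred)
print("Precision:", precision)
```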

Output:

Precision: 0.8

Recall (Sensitivity): The ratio of correctly predicted positive observations to all observations in the actual positive class, that is, TP / (TP + FN).
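
A minimal sketch using scikit-learn's `recall_score`, again with assumed illustrative labels:

```python
from sklearn.metrics import recall_score

# Illustrative labels, assumed for this sketch
y_true = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0, 0, 1]

# Recall treats label 1 as the positive class by default
recall = recall_score(y_true, y_pred)
print("Recall:", recall)
```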

Output:

Recall: 0.8

F1 Score: The harmonic mean of precision and recall, that is, 2 * (Precision * Recall) / (Precision + Recall).
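
A minimal sketch using scikit-learn's `f1_score`, with assumed illustrative labels. The exact digits printed may differ slightly between library versions because of floating-point rounding.

```python
from sklearn.metrics import f1_score

# Illustrative labels, assumed for this sketch
y_true = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
y_pred = [0, 1, 0, 1, 0, 1, 1, 0, 0, 1]

# F1 combines precision and recall into a single number
f1 = f1_score(y_true, y_pred)
print("F1 Score:", f1)
```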

Output:

F1 Score: 0.8000000000000002

Conclusion

The True Positive, True Negative, False Positive, and False Negative counts are the foundation for measuring classification models. Computing them is easy with scikit-learn: the `confusion_matrix` function returns all four in a single call. From these values you can derive further metrics such as precision, recall and F1 score, which give additional insight into model performance. With these tools, data scientists and machine learning engineers can evaluate their models thoroughly and build models accurate enough to support decision-making in different applications.
