Performance metrics of the classification model

Understanding the confusion matrix using a made-up case

In [1]:
import numpy as np
from sklearn.metrics import confusion_matrix, accuracy_score, classification_report
In [2]:
def create_data(size, seed_1, seed_2):
    data = np.zeros(size)
    np.random.seed(seed_1)
    data[np.random.randint(0, size, size // 9)] = 1
    np.random.seed(seed_2)
    data[np.random.randint(0, size, size // 100)] = 1   # to create some common values
    return data

For the scoring functions we need test values and predicted values,
so we'll make up some data for these two.

In [3]:
y_pred = create_data(1000,1,2)  #arguments=data size, random seed values
In [4]:
y_test = create_data(1000,1,3)

We'll count the matched 1's and matched 0's,
and the unmatched 1's and 0's.

In [5]:
# Counts of matched and unmatched predictions
matched_1 = 0
matched_0 = 0
unmatched_1 = 0
unmatched_0 = 0

for i in range(len(y_pred)):
    if y_pred[i] == y_test[i]:
        if y_pred[i] == 1:
            matched_1 += 1
        else:
            matched_0 += 1
    else:
        if y_pred[i] == 1:
            unmatched_1 += 1
        else:
            unmatched_0 += 1
In [6]:
matched_0, matched_1, unmatched_0, unmatched_1
(874, 108, 10, 8)
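The four counters from the loop above can also be computed with NumPy boolean masks instead of an explicit loop. A minimal sketch, using small hypothetical arrays (not the y_test / y_pred above):

```python
import numpy as np

# Small hypothetical arrays for illustration only
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_hat  = np.array([0, 1, 1, 0, 0, 1, 0, 1])

matched_1   = int(np.sum((y_hat == y_true) & (y_hat == 1)))  # both say 1
matched_0   = int(np.sum((y_hat == y_true) & (y_hat == 0)))  # both say 0
unmatched_1 = int(np.sum((y_hat != y_true) & (y_hat == 1)))  # predicted 1, actually 0
unmatched_0 = int(np.sum((y_hat != y_true) & (y_hat == 0)))  # predicted 0, actually 1

print(matched_0, matched_1, unmatched_0, unmatched_1)  # 3 3 1 1
```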

These are the same values we get from the confusion matrix.

In [7]:
cm = confusion_matrix(y_test, y_pred)
cm
array([[874,   8],
       [ 10, 108]])

The same analogy applies to predicting a disease or classifying spam emails:
0 for NO and 1 for YES.

True Positives

Actual value 1 and Predicted value 1

True Negatives

Actual value 0 and Predicted value 0

False Positives

Actual value 0 but Predicted value 1

False Negatives

Actual value 1 but Predicted value 0

Here, True/False indicates whether the prediction matched the actual value,
and Positive/Negative refers to the predicted class: 1 or 0.
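sklearn's confusion_matrix lets us unpack these four numbers directly. A small sketch with hypothetical arrays (not the y_test / y_pred above):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical example arrays for illustration only
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_hat  = np.array([0, 1, 1, 0, 0, 1, 0, 1])

# For labels [0, 1] sklearn lays the matrix out as [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true, y_hat).ravel()
print(tn, fp, fn, tp)  # 3 1 1 3
```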


Accuracy

$\dfrac{Correct \space predictions}{Total \space predictions}$

So, of all the predictions, how many are correct
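As a quick sanity check, this ratio matches sklearn's accuracy_score on a tiny hypothetical example:

```python
import numpy as np
from sklearn.metrics import accuracy_score

# Hypothetical example arrays for illustration only
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_hat  = np.array([0, 1, 1, 0, 0, 1, 0, 1])

correct = np.sum(y_hat == y_true)          # 6 correct predictions
manual_accuracy = correct / len(y_true)    # 6 / 8

print(manual_accuracy, accuracy_score(y_true, y_hat))  # 0.75 0.75
```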



Same using sklearn

In [8]:
accuracy_score(y_test, y_pred)
0.982

Precision

$\dfrac{TP}{Predicted \space 1's}$

So, of all the items we predicted 1, how many are correct
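A minimal check of this ratio against sklearn's precision_score, again on small hypothetical arrays:

```python
import numpy as np
from sklearn.metrics import precision_score

# Hypothetical example arrays for illustration only
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_hat  = np.array([0, 1, 1, 0, 0, 1, 0, 1])

tp = np.sum((y_hat == 1) & (y_true == 1))   # 3 true positives
predicted_1s = np.sum(y_hat == 1)           # 4 predicted 1's

print(tp / predicted_1s)                    # 0.75
print(precision_score(y_true, y_hat))       # 0.75
```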




Recall

$\dfrac{TP}{Actual \space 1's}$

So, of all the items that are actually 1, how many did we predict correctly
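The same kind of check works for recall with sklearn's recall_score, on the same hypothetical arrays:

```python
import numpy as np
from sklearn.metrics import recall_score

# Hypothetical example arrays for illustration only
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_hat  = np.array([0, 1, 1, 0, 0, 1, 0, 1])

tp = np.sum((y_hat == 1) & (y_true == 1))   # 3 true positives
actual_1s = np.sum(y_true == 1)             # 4 actual 1's

print(tp / actual_1s)                       # 0.75
print(recall_score(y_true, y_hat))          # 0.75
```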



Classification Report

Using sklearn

In [9]:
print(classification_report(y_test, y_pred))
             precision    recall  f1-score   support

        0.0       0.99      0.99      0.99       882
        1.0       0.93      0.92      0.92       118

avg / total       0.98      0.98      0.98      1000

The classification report gives these metrics for both 0's and 1's.

Precision and Recall for 0's are computed just like the ones we calculated for 1's:
instead of TP, use TN, and divide by predicted 0's and actual 0's respectively.
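sklearn exposes this through the pos_label parameter of precision_score and recall_score. A sketch on hypothetical arrays:

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Hypothetical example arrays for illustration only
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_hat  = np.array([0, 1, 1, 0, 0, 1, 0, 1])

# Treat 0 as the "positive" class: TN / predicted 0's and TN / actual 0's
print(precision_score(y_true, y_hat, pos_label=0))  # 0.75
print(recall_score(y_true, y_hat, pos_label=0))     # 0.75
```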

F1-Score is the harmonic mean of Precision and Recall.
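The harmonic mean can be checked by hand against sklearn's f1_score, again on hypothetical arrays:

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical example arrays for illustration only
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_hat  = np.array([0, 1, 1, 0, 0, 1, 0, 1])

tp = np.sum((y_hat == 1) & (y_true == 1))
precision = tp / np.sum(y_hat == 1)
recall = tp / np.sum(y_true == 1)

# Harmonic mean of precision and recall
f1_manual = 2 * precision * recall / (precision + recall)

print(f1_manual)                   # 0.75
print(f1_score(y_true, y_hat))     # 0.75
```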

Support is the number of actual 0's and actual 1's in the test data.
