Area Under the Curve

We’ve looked at precision and recall, and we’ve learnt how these simple calculations can give us a much better understanding of where our model is performing well, not just how well it’s performing overall.

We’re going to build on this by revisiting recall under the guise of ‘Sensitivity’, and by introducing a new metric, Specificity.

Sensitivity – True Positive Rate (Recall)

Let’s quickly recap what recall is, from the previous post.

Recall calculates the True Positive Rate: True Positives / (True Positives + False Negatives).

It aims to answer the question: What proportion of actual positives was identified correctly?

It simply measures our model’s ability to correctly identify the positive result.

We’ll be using this later.
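As a quick refresher, here’s a minimal sketch of the recall calculation in plain Python. The function name `sensitivity` and the labels are made up for illustration (1 means positive, 0 means negative); they aren’t from a real model.

```python
def sensitivity(y_true, y_pred):
    """True positive rate: TP / (TP + FN)."""
    # Actual positives the model also predicted as positive
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    # Actual positives the model missed
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("no actual positives in y_true")
    return tp / (tp + fn)


# Hypothetical labels: 4 actual positives, 6 actual negatives
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

recall = sensitivity(y_true, y_pred)
print(recall)
```

With these made-up labels, the model finds 3 of the 4 actual positives.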

Specificity – True Negative Rate

Let’s get straight into what the formula for specificity looks like.

Specificity = True Negatives / (True Negatives + False Positives)

Specificity calculates what proportion of actual negatives were identified correctly.

If sensitivity is how good our model is at picking up the actual positive results, specificity is how good our model is at picking up the actual negative results.
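Specificity mirrors the recall calculation, just on the negative class. Here’s a minimal sketch; the function name `specificity` and the labels are hypothetical and only there to illustrate the arithmetic.

```python
def specificity(y_true, y_pred):
    """True negative rate: TN / (TN + FP)."""
    # Actual negatives the model also predicted as negative
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    # Actual negatives the model wrongly flagged as positive
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tn + fp == 0:
        raise ValueError("no actual negatives in y_true")
    return tn / (tn + fp)


# Hypothetical labels: 4 actual positives, 6 actual negatives
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

true_negative_rate = specificity(y_true, y_pred)
print(round(true_negative_rate, 3))
```

Here 5 of the 6 actual negatives are identified correctly, and 1 minus this value is the false positive rate.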

Sometimes when we’re modelling, we care just as much about the negative result as the positive result. On its own, specificity is a valuable metric which we should be aware of.

However, combined with sensitivity, we can start thinking of our model performance as a careful balance, which can be very powerful.

ROC – Receiver Operating Characteristic

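The ROC curve shows how a model’s performance changes as we vary the probability threshold used to call a result positive. It’s created by plotting the Sensitivity (true positive rate) on the Y-axis, against 1 minus the Specificity (false positive rate) on the X-axis.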
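To make that concrete, here’s a rough sketch of how the points on a ROC curve could be calculated by hand: sweep a threshold across the predicted scores, and record the false positive rate and true positive rate at each step. The function name `roc_points` and the four scores are assumptions for illustration, and the sketch assumes both classes are present in the data.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    # Start above every score so the first point is (0, 0): nothing predicted positive
    thresholds = [float("inf")] + sorted(set(scores), reverse=True)
    points = []
    for threshold in thresholds:
        # Anything scoring at or above the threshold is predicted positive
        tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= threshold)
        fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= threshold)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical labels and predicted probabilities
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(fpr, tpr)
```

Each pair is one point on the curve, with the X value (false positive rate) first and the Y value (true positive rate) second.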

AUC – Area Under the Curve

The AUC is, literally, the area under the ROC curve. It measures the entire 2D space underneath the curve. If you’ve done any high school maths, you’ll recognise this as the integral of the curve.

A perfect model has an AUC of 1, while a model that does no better than random guessing has an AUC of around 0.5.
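If you’d like to see the area calculated, one common approach is the trapezoidal rule: add up the area of the strip between each pair of neighbouring points. This is a minimal sketch, not necessarily what your modelling tool does internally; the function name `auc` and the example points are hypothetical.

```python
def auc(points):
    """Trapezoidal area under (fpr, tpr) points ordered by increasing fpr."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Width of the strip times its average height
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical ROC points as (false positive rate, true positive rate)
example_curve = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
perfect_curve = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
random_curve = [(0.0, 0.0), (1.0, 1.0)]

example_auc = auc(example_curve)
perfect_auc = auc(perfect_curve)
random_auc = auc(random_curve)
print(example_auc, perfect_auc, random_auc)
```

The perfect curve hugs the top-left corner and covers the whole square, while the diagonal line covers exactly half of it.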

Sensitivity and Specificity trade off against each other: as one goes up, the other tends to go down.
When we increase the false positive rate that we’re happy to accept, it allows us to increase our true positive rate.

We can also see this when we look at the curve.

In the above example, we see a significant improvement when we move the false positive rate we’ll accept from 0% to 10%, as the true positive rate increases to 40%.

At a false positive rate of around 25%, the true positive rate is now 70%. We can use this to understand how well the model is performing.

Putting this Into Practice

When you train your models, whether you’re using R, Python, SAS or anything else, it’s likely that your tooling can run all of this behind the scenes and give you a ROC curve, along with a summary AUC value.

If you know what you want your model to be good at, whether that’s catching as many positives as possible or keeping false positives low, these values can be super helpful.
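As one example of using these values, suppose you’ve decided on the highest false positive rate you can live with. A rough sketch of picking the threshold that gives the best true positive rate within that budget might look like this. The function name `best_threshold`, the 20% budget and the scores are all assumptions for illustration.

```python
def best_threshold(y_true, scores, max_fpr):
    """Pick the threshold with the highest TPR whose FPR stays within max_fpr.

    Returns a (tpr, fpr, threshold) tuple, or None if no threshold qualifies.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best = None
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= threshold)
        fp = sum(1 for t, s in zip(y_true, scores) if t == 0 and s >= threshold)
        tpr, fpr = tp / positives, fp / negatives
        # Keep the candidate only if it fits the budget and improves the TPR
        if fpr <= max_fpr and (best is None or tpr > best[0]):
            best = (tpr, fpr, threshold)
    return best


# Hypothetical labels and predicted probabilities
y_true = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]

tpr, fpr, threshold = best_threshold(y_true, scores, max_fpr=0.2)
print(tpr, round(fpr, 3), threshold)
```

If false negatives were the bigger worry instead, you could flip this around and look for the lowest false positive rate that still reaches a target true positive rate.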

Next Up – F1, the balancing metric

The ROC curve and AUC combination is a great visual way to understand where your model is doing well, coupled with an overall statistic that gives you a simple understanding.

Next, we’re going to extend these metrics once again, and look at the F1 value.