Handling Imbalanced Class Issues with the Matthews Correlation Coefficient

Written by stylianoskampakis | Published 2022/05/02
Tech Story Tags: machine-learning | imbalanced-class | model-evaluation | mcc | r | cohen's-kappa | data-science | confusion-matrix-in-ml

TLDR: Sometimes in data science and machine learning we encounter imbalanced class problems, where one class has many more instances than another. One metric that helps with this problem is the Matthews Correlation Coefficient (MCC). The MCC takes values between -1 and 1, and a score of 1 indicates perfect agreement. But how does the MCC compare against other popular metrics for imbalanced class problems, such as the F1-score? An argument has been made in favour of the MCC, but I've found that in practice both metrics give similar results.

Sometimes in data science and machine learning, we encounter imbalanced class problems, where one class has many more instances than another. This makes accuracy a bad metric: if class A makes up 90% of our sample and class B the remaining 10%, then we can get 90% accuracy simply by always predicting class A.

One metric that helps with this problem is the Matthews Correlation Coefficient (MCC), which was introduced in the binary setting by Matthews in 1975. Before we show the calculation of the MCC, let's first revisit the concept of a confusion matrix. As you can see in the image below, a confusion matrix has four cells, created by crossing the predicted values against the real values. Two of those cells represent correct predictions (True Positives and True Negatives), and the other two represent incorrect predictions (False Positives and False Negatives).
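The four cells are easy to extract in code. A minimal sketch using scikit-learn's confusion_matrix (the label vectors here are made up for illustration):

```python
from sklearn.metrics import confusion_matrix

# Hypothetical ground-truth and predicted labels (1 = positive, 0 = negative)
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]

# ravel() flattens the 2x2 matrix into its four cells.
# With scikit-learn's ordering (rows = true, columns = predicted,
# labels sorted ascending), the cells come out as TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(tn, fp, fn, tp)
```

Note that scikit-learn places True Negatives in the top-left cell, so the unpacking order matters.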

The Matthews correlation coefficient is calculated as follows:

MCC = (TP × TN − FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN))
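Under this standard definition, the MCC can be computed directly from the four confusion-matrix cells. A minimal sketch with made-up counts:

```python
import math

# Hypothetical confusion-matrix counts
tp, tn, fp, fn = 90, 5, 5, 10

# MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
numerator = tp * tn - fp * fn
denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
mcc = numerator / denominator
print(round(mcc, 3))
```

In practice you would use a library implementation rather than hand-rolling this, not least because the denominator can be zero when an entire row or column of the confusion matrix is empty.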

The MCC takes values between -1 and 1. A score of 1 indicates perfect agreement, 0 is no better than random guessing, and -1 indicates total disagreement between predictions and reality. But how does the MCC compare against other popular metrics for imbalanced classes?

Matthews correlation coefficient vs. the F1-score

The F1-score is another very popular metric for imbalanced class problems. The F1-score is calculated as:

F1 = 2 × (precision × recall) / (precision + recall)

So, it is simply the harmonic mean of precision and recall. According to a paper, the MCC has two advantages over the F1-score.

  1. The F1-score changes when the classes are swapped, while the MCC is invariant if the positive class is renamed negative and vice versa.

  2. The F1-score is independent of the number of samples correctly classified as negative (the True Negatives).

Therefore, it is argued that the MCC is a more complete measure than the F1-score.
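The second point is easy to verify numerically: padding a data set with extra correctly classified negatives leaves the F1-score untouched but moves the MCC. A small illustrative sketch with scikit-learn (the label vectors are made up):

```python
from sklearn.metrics import f1_score, matthews_corrcoef

# A small imbalanced sample (1 = positive class)
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0]

# The same sample padded with 20 extra true negatives:
# both the labels and the predictions gain correctly classified 0s
y_true_padded = y_true + [0] * 20
y_pred_padded = y_pred + [0] * 20

# F1 ignores true negatives, so it does not move; the MCC does
print(f1_score(y_true, y_pred), f1_score(y_true_padded, y_pred_padded))
print(matthews_corrcoef(y_true, y_pred),
      matthews_corrcoef(y_true_padded, y_pred_padded))
```

On this toy sample the F1-score stays at 2/3 in both cases, while the MCC rises once the extra true negatives are counted.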

Matthews correlation coefficient vs. Cohen's Kappa

Cohen’s kappa is one of my favorite measures and one that we’ve written about on this blog. It’s a great metric for imbalanced class problems. This paper compares the two metrics. An argument is made in favor of the MCC, but I personally believe that it’s too theoretical. I’ve found that in practice both metrics give similar results, and I am using both in all my projects.
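Both metrics ship with scikit-learn, so comparing them on the same predictions takes one line each. A quick sketch with a made-up imbalanced sample, where the two scores come out close but not identical:

```python
from sklearn.metrics import cohen_kappa_score, matthews_corrcoef

# Hypothetical imbalanced sample: 45 negatives, 5 positives
y_true = [0] * 45 + [1] * 5
# Predictions: 41 true negatives, 4 false positives,
# then 4 true positives and 1 false negative
y_pred = [0] * 41 + [1] * 4 + [1] * 4 + [0] * 1

print(round(matthews_corrcoef(y_true, y_pred), 3))
print(round(cohen_kappa_score(y_true, y_pred), 3))
```

On samples like this one the two scores typically land within a few hundredths of each other, which matches my experience that they rarely disagree in practice.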

Using the MCC in Python and R

Using the MCC in Python is very easy: you can just use scikit-learn's metrics API, where it is implemented as the function matthews_corrcoef. In R, you can use the function mcc from the mltools package.
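A minimal Python example (the label vectors are made up):

```python
from sklearn.metrics import matthews_corrcoef

# Hypothetical true and predicted labels
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]

# matthews_corrcoef takes the labels directly,
# no need to build the confusion matrix yourself
print(matthews_corrcoef(y_true, y_pred))
```

The function also handles multiclass labels, in which case it computes the multiclass generalisation of the MCC.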


Previously published here.


Written by stylianoskampakis | My name is Stylianos (Stelios) Kampakis and I am a data scientist.