Numbers Don't Lie But They Are Easily Misinterpreted All The Time

Let's start with an obvious example (example 1):
1. Virus A has an average fatality rate of 10% (1 death per 10 infections on average)
2. Virus B has an average fatality rate of 1% (1 death per 100 infections on average)

Which virus is more dangerous to the majority?

If you think the answer must always be virus A, then you're probably prone to misinterpreting numbers, because in this case you're passing judgment with too little information.

What if I give you their infection rates as well?

1. Virus A has an average infection rate of 2 every week (every
infected individual infects 2 previously uninfected ones per week on
average)

2. Virus B has an average infection rate of 5 every week (every
infected individual infects 5 previously uninfected ones per week on
average)

First, let's do some math on the estimated death numbers after 4 weeks, assuming each outbreak starts from a single infected individual:

1. Virus A death numbers = 2 ^ 4 * 0.1 = 1.6

2. Virus B death numbers = 5 ^ 4 * 0.01 = 6.25

And the same estimates after 8 weeks:

1. Virus A death numbers = 2 ^ 8 * 0.1 = 25.6

2. Virus B death numbers = 5 ^ 8 * 0.01 = 3906.25

I think it's now clear that, as time progresses, the gap between the
deaths caused by virus B and those caused by virus A will only keep
growing, so this case shows that infection rates can easily outweigh
fatality rates when it comes to evaluating how dangerous a virus is to
the majority.
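
If you'd like to check the arithmetic yourself, here's a minimal Python
sketch of the same simplified model. The function name estimated_deaths
and its parameters are my own labels for illustration, and the model
assumes, as the figures above do, that each outbreak starts from a single
infection and that infections simply multiply by the infection rate every
week:

```python
def estimated_deaths(infection_rate, fatality_rate, weeks):
    """Estimated deaths under the simplified model used in the text.

    Assumption: one initial infection, and infections multiply by
    `infection_rate` every week (infection_rate ** weeks infections).
    """
    return infection_rate ** weeks * fatality_rate


for weeks in (4, 8):
    deaths_a = estimated_deaths(2, 0.10, weeks)  # virus A: rate 2, fatality 10%
    deaths_b = estimated_deaths(5, 0.01, weeks)  # virus B: rate 5, fatality 1%
    print(
        f"week {weeks}: virus A = {deaths_a:.2f}, "
        f"virus B = {deaths_b:.2f}, B/A = {deaths_b / deaths_a:.1f}"
    )
```

The ratio of virus B deaths to virus A deaths grows from roughly 3.9 at
week 4 to roughly 152.6 at week 8, which is the widening gap described
above.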

Of course, this alone doesn't mean that virus B must be more
dangerous to the majority, but it's a small, simple example of how
numbers can be misinterpreted, because judging from a single metric
alone is usually dangerous.

Now let's move on to a more complicated and convoluted example (example 2):

1. Country A, with a population of 1B, has 1k confirmed cases of
virus C, 10 months after its first confirmed case of that virus

2. Country B, with a population of 100M, has 100k confirmed cases of
virus C, 1 month after its first confirmed case of that virus

Which country performed better in controlling the infections of virus C so far?

Now there are 3 different yet interrelated metrics for each country,
so the problem of judging from a single metric is gone in this example.
This time, therefore, you may think it's safe to assume that country A
must have performed better in controlling the infections of virus C so
far.

Unfortunately, you're likely being fooled again, especially once I
give you the number of tests for virus C that each country has
performed:

1. Country A - 10k tests performed for virus C

2. Country B - 10M tests performed for virus C

This metric for both countries, combined with the other metrics, reveals 2 new facts that point to the opposite judgment:

1. Country A has tested just 10k / 10 / 1B = 0.0001% of its population
for virus C per month on average, while country B has tested
10M / 100M = 10% of its population in its single month

2. In country A, 1k / 10k = 1 out of every 10 people tested is infected
on average, while in country B it's 100k / 10M = 1 out of 100

So, while this still doesn't imply that country B must have performed
better in controlling the infections of virus C so far (country A's low
case count may simply reflect how few tests it has run), this example
aims to show that even a set of different yet interrelated metrics
isn't always safe from misinterpretation.

So, why can numbers be misinterpreted so easily? At the very
least, because numbers without context are usually ambiguous or even
meaningless, and realizing that context is missing generally demands
relevant knowledge.

For instance, in example 2, if you don't know the importance of the
number of tests, it'd be hard for you to realize that even the other 3
metrics combined still don't form a complete context. And if most people
around the world don't know that, some countries can simply minimize
the number of tests they perform for virus C, so that the numbers make
them look like they've been performing incredibly well in controlling
the infections of virus C so far. In other words, numbers without
context can also enable cheating by being misleading rather than
outright lying.

Sometimes, the context will remain incomplete even when you have all
the relevant numbers, because some contexts contain important details
that are very hard to quantify, so when it comes to relevant knowledge,
knowing those details is crucial as well.

Let's consider this example (example 3) of a team of 5 employees who
are supposed to handle the same set of support tickets every day, and
none of whom receives any overtime compensation (in fact, working
overtime is perceived as incompetence there):

1. Employees A, B, C, and D actually work the expected 40-hour week
every week, and each of them handles 20 support tickets (all handled
properly) per day on average

2. Employee E actually works an 80-hour week on average instead of
the expected 40, and he/she handles 10 support tickets (all handled
properly) per day on average
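
Taken at face value, these figures can be turned into a tickets-per-hour
KPI, as in the minimal Python sketch below. The function name is
hypothetical, and the 5-day working week is my assumption, since the
example only gives tickets per day and hours per week:

```python
def tickets_per_hour(tickets_per_day, hours_per_week, days_per_week=5):
    """Apparent throughput: tickets handled per hour actually worked.

    Assumption: a 5-day working week (not stated in the example).
    """
    return tickets_per_day * days_per_week / hours_per_week


others = tickets_per_hour(20, 40)      # employees A, B, C, and D
employee_e = tickets_per_hour(10, 80)  # employee E
print(others, employee_e, others / employee_e)
```

On paper, each of employees A, B, C, and D comes out at 2.5 tickets per
hour against 0.625 for employee E, a 4x gap.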

Does this mean employee E is far from being on par with the rest of
the team? If you think the answer must always be yes, then I'm afraid
you've yet again misused those KPIs, because in this case the missing
context includes at least the average difficulty of the support
tickets handled by those employees, and such difficulty is generally
very hard to quantify.

You may think that, as all 5 employees are supposed to handle the
same set of support tickets, the difference in ticket difficulty alone
shouldn't cause such a big gap in apparent productivity between
employees A, B, C, and D on one side and employee E on the other.

But what if I tell you that the former 4 employees have been taking
only the easiest support tickets since day 1, leaving all the hardest
ones to employee E, because the internal reporting mechanisms against
such workplace bullying are effectively dysfunctional and employee E is
especially vulnerable to such abuse?

Again, whether that team is really that toxic is also very hard to
quantify, so in this case, even if you have all the relevant KPIs on
employee performance, those KPIs as a set can still be very misleading
when they're used on their own to judge that performance.

Of course, example 3 is most likely an edge case that shouldn't happen, but that doesn't mean such edge cases will never appear.

Unfortunately, many of those using KPIs to pass judgment do act
as if such edge cases will never exist under their management, and even
when they do exist, those managers will still behave as if the edge
cases themselves are to blame, possibly all for the sake of illusory
effectiveness and efficiency.

To be blunt, this kind of "effectiveness and efficiency" is
really just pushing the complexities that should be at least partially
handled by those managers onto the edge cases themselves, causing the
latter to suffer far more than they already do even without those
extra complexities forced onto them.

While such use of KPIs does make managers and the common cases much
more effective and efficient, it comes at the cost of sacrificing the
edge cases, and the most dangerous part of all is that, too often, many
of those managers and common cases don't even know that's what they've
been doing for ages.

Of course, this world isn't capable of being that ideal yet, so
sometimes misinterpreting the numbers might be necessary or the lesser
evil, because occasionally the absolute minimum required effectiveness
and efficiency can only be achieved by somehow sacrificing a small
number of edge cases. But at the very least, those using KPIs that way
should really know what they're doing, and make sure they make such
sacrifices only when they have to.

So, on one hand, judging by numbers alone can easily lead to utterly
wrong judgments without you knowing, while on the other hand, judging
only with the full context isn't always feasible, practical, or
realistic. A working compromise between these 2 extremes should
therefore be found on a case-by-case basis.

For instance, you can first form a set of educated hypotheses
based on the numbers, then try to both prove and disprove those
hypotheses (both sides must be worked on). Meanwhile, if you have to,
act upon them as long as they haven't been proven wrong yet, while
always keeping in mind that they can all be dead wrong, and plan for
contingencies so you can fix the problems immediately.

With such a compromise, effectiveness and efficiency can be
largely preserved when those hypotheses hold, because you're not
delaying your judgments too much, and the damage caused when they're
wrong can be largely controlled and contained, because you'll be able
to realize and correct your mistakes as quickly as possible.

For instance, in example 3, while it's reasonable to form the
hypothesis that employee E is indeed far from being on par with the rest
of the team, you should, instead of just acting on those numbers
directly, also try to have a personal meeting with that employee as soon
as possible, so you can express your concerns about those metrics to
him/her and hear his/her side of the story. That can be very useful for
proving or disproving your hypothesis, so that both of you can solve
the problem together in a more informed manner.