COVIDFakeExplainer: An Explainable Machine Learning based Web Application: Abstract & Intro

Written by escholar | Published 2024/02/15
Tech Story Tags: machine-learning | research-paper-on-fake-news | fake-news | machine-learning-fake-news | covid-19-machine-learning | deep-learning | fake-news-ml-algorithms | explainability

TL;DR: Leveraging machine learning, including deep learning techniques, offers promise in combatting fake news.

This paper is available on arxiv under CC 4.0 license.

Authors:

(1) Dylan Warman, School of Computing, Charles Sturt University;

(2) Muhammad Ashad Kabir, School of Computing, Mathematics.


Abstract

Fake news has emerged as a critical global issue, magnified by the COVID-19 pandemic, underscoring the need for effective preventive tools. Leveraging machine learning, including deep learning techniques, offers promise in combatting fake news. This paper goes beyond detection accuracy alone by establishing BERT as the superior model for fake news detection and demonstrating its utility as a tool to empower the general populace. We have implemented a browser extension, enhanced with explainability features, enabling real-time identification of fake news and delivering easily interpretable explanations. To achieve this, we have employed two publicly available datasets and created seven distinct data configurations to evaluate three prominent machine learning architectures. Our comprehensive experiments affirm BERT’s exceptional accuracy in detecting COVID-19-related fake news. Furthermore, we have integrated an explainability component into the BERT model and deployed it as a service through Amazon Web Services (AWS) API hosting. We have developed a browser extension that interfaces with the API, allowing users to select and transmit data from web pages, receiving an intelligible classification in return. This paper presents a practical end-to-end solution, highlighting the feasibility of constructing a holistic system for fake news detection, which can significantly benefit society.

Index Terms—COVID-19, machine learning, deep learning, fake news, explainability, web application, chrome extension

I. INTRODUCTION

Fake news is known by many interchangeable names, the two main ones being “fake news” itself and “misinformation” [1], [2]. These terms denote false or misleading information shared to deceive an individual, group, or population into believing something that is not true, often with political motivations and the intent to damage public trust. In the context of this article, we refer to all of this as fake news [3].

Social networks provide a platform that millions of people around the world use to communicate and share information on a daily basis. However, especially at a time of global crisis such as the COVID-19 pandemic, the amount of fake news being shared is staggering. Global statistics indicate that 74% of people are very concerned about the amount of fake news they have seen during the pandemic [4], and studies have shown that more than 50% of all social media users have spread fake news knowingly or unknowingly [5].

Further research shows that fake news is not more likely to be spread by bots or artificial intelligence; people were found to be more likely to spread it [6]. A potential reason for sharing fake news lies in our cognitive biases, more specifically our memory biases and a phenomenon known as the “false memory effect” [7]. Additionally, a large disconnect has been shown between what people believe and what types of fake news they will share, furthering the idea that people share this information irrationally [8].

The magnitude of this problem has been underscored by the World Health Organization, which categorizes the proliferation of fake news during the COVID-19 pandemic as an “Infodemic” [9]. They explicitly emphasize that an Infodemic can pose as much risk to public health and security as the virus itself.

Given the clarity the World Health Organization provides on the enormity of this issue, and considering how easily even well-intentioned people can share fake news, a tool that helps a person assess the legitimacy and validity of what they are reading is vital. Furthermore, because fake news can be used maliciously to harm public trust and cause unrest, it is important that people are able to protect themselves, preventing a continued deterioration of the public’s understanding of the pandemic, among other topics.

Significant research has been conducted in this area recently, with machine learning (ML) approaches being the most widely implemented [2]. Incorporating explainability [10] could further allow us to trace the predictions generated by these ML approaches, in particular deep learning (DL) models, which are considered “black boxes”, and provide contextual insight into why a prediction was made. Therefore, explainability in fake news detection could significantly increase user trust [11], give the user a concrete understanding of why an article is fake, and potentially lead to a reduction in the sharing of fake news.
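To make this concrete, word-level explainers (e.g., LIME-style techniques) assign each word a signed weight indicating how strongly it pushed the model toward the “fake” label. Below is a minimal, hedged sketch of how such weights could be summarized for an end user; the function name and the weights shown are illustrative assumptions, not output from the paper’s actual model.

```python
# Hypothetical sketch: summarizing LIME-style word weights for a user.
# The weights below are made-up illustrations, not real model output.

def top_explanation_words(word_weights, k=3):
    """Return the k words contributing most toward the 'fake' label,
    i.e., those with the largest positive weights."""
    ranked = sorted(word_weights.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, weight in ranked[:k] if weight > 0]

# Example: positive weights push toward 'fake'; negatives push toward 'real'.
weights = {"miracle": 0.42, "cure": 0.31, "doctors": -0.05,
           "vaccine": 0.12, "the": 0.01}
print(top_explanation_words(weights))  # ['miracle', 'cure', 'vaccine']
```

Presenting only the top few positive-weight words keeps the explanation intelligible to a non-technical reader, which is the design goal the paper emphasizes.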

This paper aims to develop an explainability tool/application that is embedded directly into a Google Chrome extension. In particular, this paper makes the following three major contributions:

• We have conducted an extensive empirical evaluation of state-of-the-art ML algorithms to train a fake news classification model using two different datasets with seven configurations.

• We have identified the most prominent explainability techniques and discussed their suitability for developing a web-based application.

• We have implemented a web application as a Google Chrome extension using the best-performing ML model and the most suitable explainability technique to demonstrate the suitability and usefulness of our approach.
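The end-to-end flow described above — the extension transmits user-selected text to the cloud-hosted classifier and renders an intelligible result — can be sketched as a simple JSON request/response contract. The field names (`text`, `label`, `top_words`) and the response shape here are assumptions for illustration, not the paper’s actual API.

```python
import json

# Hypothetical request/response contract between the browser extension
# and the cloud-hosted classifier. Field names are assumptions.

def build_request(selected_text: str) -> str:
    """Serialize the user's selected page text as a JSON request body."""
    return json.dumps({"text": selected_text})

def render_result(response_body: str) -> str:
    """Turn the API's JSON response into a short message for the user."""
    result = json.loads(response_body)
    words = ", ".join(result["top_words"])
    return f"This article looks {result['label']} (key words: {words})"

# Simulated round trip with a mocked-up response, in place of a real API call.
body = build_request("Drinking hot water cures COVID-19.")
fake_response = json.dumps({"label": "fake", "top_words": ["cures", "hot water"]})
print(render_result(fake_response))
```

Keeping the contract this small means the extension only needs to POST one field and display one sentence, which matches the paper’s aim of delivering easily interpretable explanations in real time.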


Published by HackerNoon on 2024/02/15