Contingency Table: Baseline vs. Text-STILT

Written by memeology | Published 2024/04/07
Tech Story Tags: meme-sentiment-analysis | text-stilt | unimodal-sentiment-analysis | unimodal-training | multimodal-meme-classifiers | meme-sentiment-classification | unimodal-data | sentiment-analysis

TL;DR: This study introduces a novel approach, using unimodal training to enhance multimodal meme sentiment classifiers, significantly improving performance and efficiency in meme sentiment analysis.

Authors:

(1) Muzhaffar Hazman, University of Galway, Ireland;

(2) Susan McKeever, Technological University Dublin, Ireland;

(3) Josephine Griffith, University of Galway, Ireland.

Table of Links

Abstract and Introduction

Related Works

Methodology

Results

Limitations and Future Works

Conclusion, Acknowledgments, and References

A Hyperparameters and Settings

B Metric: Weighted F1-Score

C Architectural Details

D Performance Benchmarking

E Contingency Table: Baseline vs. Text-STILT

E Contingency Table: Baseline vs. Text-STILT

Table 8 shows the contingency table, as one would prepare for a McNemar's Test between two classifiers (McNemar, 1947), comparing the model trained with Text-STILT on 60% of available memes against the Baseline trained on 100% of available memes, the pair with the most similar performance. While the two models achieved similar Weighted F1-scores, Text-STILT correctly classified a notable number of memes that Baseline did not, and vice versa. Examples of such memes are discussed in Section 4.1. Furthermore, approximately 40% of memes in the testing set were incorrectly classified by both models, suggesting that these memes convey sentiment in a way that neither approach can reliably predict.
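As a minimal sketch (in Python, not part of the paper), such a 2x2 contingency table and the McNemar statistic with continuity correction can be computed from per-item correctness flags for the two models. The boolean lists below are hypothetical stand-ins for the real test-set results:

```python
def contingency_table(correct_a, correct_b):
    """Build the 2x2 agreement table between two classifiers.

    correct_a / correct_b: booleans, True where each model classified
    the corresponding test item correctly.
    Returns (both correct, only A correct, only B correct, neither).
    """
    both = sum(a and b for a, b in zip(correct_a, correct_b))
    only_a = sum(a and not b for a, b in zip(correct_a, correct_b))
    only_b = sum(b and not a for a, b in zip(correct_a, correct_b))
    neither = sum((not a) and (not b) for a, b in zip(correct_a, correct_b))
    return both, only_a, only_b, neither


def mcnemar_statistic(only_a, only_b):
    """McNemar's chi-square statistic with continuity correction.

    Only the discordant cells (exactly one model correct) enter the
    statistic; the concordant cells carry no information about which
    model is better.
    """
    return (abs(only_a - only_b) - 1) ** 2 / (only_a + only_b)


# Hypothetical per-meme correctness flags for two classifiers.
baseline_correct = [True, True, False, False]
stilt_correct = [True, False, True, False]

both, only_base, only_stilt, neither = contingency_table(
    baseline_correct, stilt_correct
)
stat = mcnemar_statistic(only_base, only_stilt)
```

The statistic is compared against a chi-square distribution with one degree of freedom; a small p-value indicates the two classifiers' error patterns differ significantly, whereas a large "neither correct" cell (as in the roughly 40% reported above) points to items that are hard for both approaches.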

This paper is available on arxiv under CC 4.0 license.

