The Outbreak: Detecting Fake Viral News, Automatically

Written by baditaflorin | Published 2016/12/09
Tech Story Tags: social-media | facebook | fake-news | donald-trump | entrepreneurship | facebook-fake-news | cambridge-analytica

TLDR The Outbreak is a tool designed to automatically identify viral news before it goes viral. Using the Outbreak, we detect only the articles that are going viral and send them to a human, who evaluates whether each one is fake or accurate. Using this, we can train the algorithm to do predictive analysis. What these fake articles have in common is that they generate strong emotional responses. This can be detected using the Google Cloud NLP API or the Watson API. Combined with Facebook reactions data, we have better tools than just one year ago.

Two weeks ago I published a post about how we can detect fake viral news using the Outbreak, a tool designed to automatically identify viral news before it goes viral.
After working non-stop for two weeks, I am now tracking over 4000 Facebook posts from 30 Facebook pages that usually post fake or misleading news, to see whether I can identify fake viral news before it goes viral.
Instead of having journalists read every post and article published by these websites, we detect only the ones that are going viral and send them to a human, who evaluates whether the article is fake or accurate.
As a case study, I chose an article published on 6 December.
A quick visit to Snopes reveals the real story.
  • At the time of this posting, the article had over 32K shares in total; 10K of those shares originated from the Facebook page Conservative News Today, a page with 2.5M likes.
  • In the first hour after the system detected the link, it had 236 shares.
  • By the second hour, it was up to 1630 shares. This is when the system flagged the article as a viral article.
Another 5 hours passed before the article had over 5000 shares.
The Snopes article came out on Dec 07, 2016, one day later, and got 484 shares in total.
By that time, the fake article had 8000 shares on the Facebook page alone.
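The flagging step described above can be sketched in a few lines of Python. This is only an illustration and not the actual Outbreak code: the `Snapshot` type, the function name and both thresholds (`min_shares`, `min_growth`) are assumptions chosen so that the share counts quoted above trigger a flag in the second hour.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Snapshot:
    """One hourly reading of a post's share count (hypothetical structure)."""
    hours_since_detection: int
    shares: int


def first_viral_snapshot(
    snapshots: Iterable[Snapshot],
    min_shares: int = 1000,    # assumed threshold, not from the article
    min_growth: float = 3.0,   # assumed hour-over-hour growth factor
) -> Optional[Snapshot]:
    """Return the first snapshot that looks viral, or None if there is none.

    A snapshot counts as viral when the share count is already large and
    has grown by at least `min_growth` times since the previous reading.
    """
    ordered = sorted(snapshots, key=lambda s: s.hours_since_detection)
    for previous, current in zip(ordered, ordered[1:]):
        if previous.shares <= 0:
            continue  # cannot compute a growth factor from zero shares
        growth = current.shares / previous.shares
        if current.shares >= min_shares and growth >= min_growth:
            return current
    return None


# Share counts from the case study: 236 in the first hour, 1630 in the second.
history = [Snapshot(1, 236), Snapshot(2, 1630)]
flagged = first_viral_snapshot(history)
print(flagged)
```

With these assumed thresholds, the jump from 236 to 1630 shares (roughly a sevenfold increase) is enough to flag the article in the second hour, while a post that grows slowly is never sent to a human reviewer.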
And shares are just one part of the picture. We also have the number of likes, comments and emotional reactions. Using this data, we can train the algorithm to do predictive analysis.
One thing these fake articles have in common is that they generate strong emotional responses. This can be detected using the Google Cloud NLP API or the Watson API. Combined with Facebook reactions data, this gives us a better understanding and better tools than we had just one year ago.
The vertical axis is the number of reactions for each specific emotion, and the horizontal axis is the number of likes.
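As a rough illustration of how Facebook reactions data could be turned into an emotional signal, here is a minimal Python sketch. It is not the Outbreak implementation: the function names, the dictionary layout and the sample counts are all made up for the example, and it looks only at reaction counts, not at the text analysis that the Google Cloud NLP API or the Watson API would provide.

```python
from typing import Dict, Optional

# The five non-"like" reactions Facebook offered at the time of writing.
EMOTIONAL_REACTIONS = ("love", "haha", "wow", "sad", "angry")


def emotional_ratio(reactions: Dict[str, int]) -> float:
    """Share of all reactions that are emotional rather than plain likes."""
    likes = reactions.get("like", 0)
    emotional = sum(reactions.get(name, 0) for name in EMOTIONAL_REACTIONS)
    total = likes + emotional
    return emotional / total if total else 0.0


def dominant_emotion(reactions: Dict[str, int]) -> Optional[str]:
    """Name of the most frequent emotional reaction, or None if there is none."""
    counts = {name: reactions.get(name, 0) for name in EMOTIONAL_REACTIONS}
    name = max(counts, key=counts.get)
    return name if counts[name] > 0 else None


# Made-up reaction counts for a single post, for illustration only.
post_reactions = {"like": 600, "angry": 300, "sad": 80, "wow": 20}
print(emotional_ratio(post_reactions))
print(dominant_emotion(post_reactions))
```

A post where a large fraction of reactions are emotional, and where one emotion clearly dominates, could then be given higher priority for human review. Where to set that cutoff is something that would need to be learned from the tracked posts.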
I want to continue working on this project, which is why I set up this crowdfunding campaign with the aim of raising $5000 to get better server infrastructure and pay for part of the time that I dedicate to this project.
Help support this effort: https://www.youcaring.com/antivirus-against-fake-news

Sorry for the language mistakes in the article, English is not my first language.
About Me
For the last 3 years I have been a collaborator with the Organized Crime and Corruption Reporting Project (OCCRP), where I do data analysis and pattern recognition to uncover patterns of corruption in unstructured datasets.
In September 2016 I moved to San Francisco to start a new life here.

Published by HackerNoon on 2016/12/09