Fight Crime with Social Network Analysis

Written by danielbarta | Published 2018/02/12
Tech Story Tags: data-science | social-network-analysis | crime | terrorism | police


Detectives need all the help they can get, and social network analysis (SNA) is a potent tool in modern crime fighting. Because SNA relies heavily on clean data, its results alone are not a sufficient basis for action, but it is a milestone in intelligence assessment.

What is a graph?

It’s a pretty basic concept: draw some spots (more precisely, nodes or vertices) and connect them (the connections are usually called edges), and you have a graph. This concept is simple yet general enough to be used for tons of problems, like modeling atoms, the spread of diseases, language processing, and ranking web pages, just to name a few. It has also turned out to be extremely useful for studying social behavior.

Social Network Analysis

In 1932, there was a phenomenon without precedent in the New York Training School for Girls, in Hudson: 14 girls ran away within two weeks. Jacob Moreno, a leading social scientist, was hired to find an explanation. He modeled the individuals’ relations, intelligence, and social activities. His model revealed that the main driving force was their social environment: they behaved as the closest, most influential people around them did. This was one of the first well-documented cases in which analyzing social behaviors helped demystify complex group behaviors.

Jacob L. Moreno, an early social scientist, among others (Source)

Social Network Analysis models connections between people or groups with nodes (vertices, points, actors) and links (relationships, interactions) between them. Using different measures, the structure of such social networks can be studied, which can help explain specific group behaviors.

A new dawn in law enforcement

Without much research, it is safe to say that in a criminal organization some people are more important than others. Unfortunately, in bigger and more complex organizations, it is not easy to decide who is more important. It is also a fair assumption that people who are not yet part of the organization, but are already connected to the periphery of such a network, are more prone to becoming criminals themselves. SNA is a tool to analyze these networks and to identify the influential actors or those at risk. With this information in hand, law enforcement can act more effectively while at the same time reducing the number of such actions.

Data sources in crime prevention

Like any analytical tool, SNA is no good without sufficient and reliable data. While there are multiple ways to gain information, each has its own disadvantages.

Co-offending data

Co-offending data can easily be acquired from criminal records (if one has the rights to those, of course). In a co-offending graph, there is an undirected connection between two individuals if they committed a crime together. Relationships between individuals should be weighted, since some pairs probably appear multiple times in the criminal records.

Co-offending network of a Swedish street gang (Source)
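To make the construction concrete, here is a minimal sketch of how a weighted co-offending graph could be derived from records. The `records` list and the person labels are made up for illustration; real criminal records would need cleaning and access rights first.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each entry lists the people involved in one offence.
records = [
    {"A", "B"},
    {"A", "B", "C"},
    {"C", "D"},
    {"A", "B"},
]

def co_offending_edges(records):
    """Return {(person1, person2): number of offences committed together}."""
    weights = Counter()
    for offenders in records:
        # Sorting gives every undirected pair one canonical key.
        for pair in combinations(sorted(offenders), 2):
            weights[pair] += 1
    return weights

edges = co_offending_edges(records)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

The edge weight is simply the number of shared offences, so pairs that keep appearing together stand out from one-off contacts.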

Surveillance data

Surveillance is a methodical way to acquire information, usually via electronic devices. The laws regarding surveillance differ from country to country and cause debates all around the world. For example, the Swedish police can only retain such information if the individual has already committed a crime or is considered likely to commit one.

Surveillance and co-offending network of the same gang where the nodes are scaled by local clustering coefficient (Source)

Intelligence-based data

Intelligence is a more general concept: it includes surveillance and other, less formal sets of information.

Intelligence based network of the Swedish street gang in 2007 and 2010 (Source)

Drawbacks

Seeing the difference between the co-offending and surveillance networks, one might think that something is wrong here, and rightfully so. The problems stem from the fact that, by the nature of criminal organizations, it is especially hard to gather data of sufficient quality, and every data source has its problems:

  • Co-offending records can miss data; numerous crimes were probably never discovered, but it can also happen that individuals were wrongly accused of a crime.
  • Intelligence is usually based on human information, and as such, it probably reflects the view of the data source, which might or might not be close to reality.
  • Surveillance records are collected in cases that are deemed interesting. Thus, some incidents are far better documented than others.

Fundamental measures in social network analysis

Assuming that intelligence work has managed to collect sufficient, unbiased data, the next task is to find the central actors: the persons who influence the network the most. There are lots of measures (in fact, everything in graph theory can be applied); without aiming for completeness, I list some of the most fundamental ones.

Degree centrality

The most obvious one: it counts how many edges a node has. In this interpretation, the more incoming and outgoing edges a node has, the more important the associated person is. For example, in the graph below, Node 5 has three edges, while Node 3 has two.

(Source)
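As a small sketch, degree can be computed directly from an edge list. The `EDGES` list below is an assumption: it is a six-node example graph chosen to match the counts quoted above (Node 5 has three edges, Node 3 has two), not necessarily the exact graph in the figure.

```python
# Assumed example graph (undirected); each tuple is one edge.
EDGES = [(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)]

def build_adjacency(edges):
    """Map every node to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree_centrality(adj):
    """Degree divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    return {node: len(neighbours) / (n - 1) for node, neighbours in adj.items()}

adj = build_adjacency(EDGES)
degrees = {node: len(neighbours) for node, neighbours in adj.items()}
centrality = degree_centrality(adj)
print(degrees[5], degrees[3])  # raw edge counts for Node 5 and Node 3
```

Dividing by n - 1 is a common normalisation that makes scores comparable between networks of different sizes; the raw count works just as well within a single network.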

Betweenness

Betweenness is another straightforward measure: it counts how many of the shortest paths between pairs of other nodes pass through the node in question.
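A minimal sketch of the computation for an unweighted, undirected graph follows. It uses Brandes' algorithm, which is one standard way to do this, and it follows the usual convention that when several shortest paths tie, each counts as a fraction. The `EDGES` list is the same assumed six-node example graph as before.

```python
from collections import deque

# Assumed example graph (undirected); each tuple is one edge.
EDGES = [(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)]

def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def betweenness(adj):
    """Betweenness centrality via Brandes' algorithm (unweighted graph)."""
    score = {v: 0.0 for v in adj}
    for source in adj:
        # Breadth-first search from source, counting shortest paths.
        dist = {source: 0}
        paths = {v: 0 for v in adj}      # number of shortest paths from source
        paths[source] = 1
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path shares.
        share = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                share[v] += paths[v] / paths[w] * (1 + share[w])
            if w != source:
                score[w] += share[w]
    # Every unordered pair was visited from both ends, so halve the totals.
    return {v: s / 2 for v, s in score.items()}

adj = build_adjacency(EDGES)
scores = betweenness(adj)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, scores[node])
```

On this assumed graph, nodes 2, 4 and 5 all have three edges, yet node 4 gets the highest betweenness because it is the only route to node 6.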

Eigenvector centrality

The previous measures assume that every node and every edge is equally important. Of course, in SNA this is usually not the case. It can easily happen that a person has lots of connections, but all of them are with non-influential persons. Thus, while there might be dozens of relationships, the person cannot be considered important.

With eigenvector centrality, the importance of a node is based on how many connections it has to other important nodes. Of course, the interesting question is how we can define “important”. In this case, a node is considered to be more important if it has lots of connections, or is connected to nodes that have lots of connections… Or is connected to a node that is connected to a node that has even more connections.

Source

This has proved to be a handy measure, especially in bigger networks. For instance, Google’s famous PageRank algorithm for ranking web pages is a variant of eigenvector centrality.
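The circular definition ("important nodes are connected to important nodes") can be resolved by repeating one simple step until the scores stop changing. Below is a minimal sketch using power iteration; the `EDGES` list is the same assumed six-node example graph as before, and the tolerance and iteration limit are arbitrary choices.

```python
import math

# Assumed example graph (undirected); each tuple is one edge.
EDGES = [(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)]

def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def eigenvector_centrality(adj, max_iter=1000, tol=1e-10):
    """Power iteration; assumes a connected, undirected graph."""
    x = {v: 1.0 for v in adj}
    for _ in range(max_iter):
        # Each node keeps its own score and adds its neighbours' scores.
        # Keeping the own score (iterating on A + I) avoids oscillation
        # and does not change the resulting eigenvector.
        new = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = math.sqrt(sum(s * s for s in new.values()))
        new = {v: s / norm for v, s in new.items()}
        if sum(abs(new[v] - x[v]) for v in adj) < tol:
            return new
        x = new
    raise RuntimeError("power iteration did not converge")

adj = build_adjacency(EDGES)
scores = eigenvector_centrality(adj)
for node in sorted(scores, key=scores.get, reverse=True):
    print(node, round(scores[node], 3))
```

On this assumed graph, degree alone cannot separate nodes 2, 4 and 5, but eigenvector centrality puts node 5 first: one of node 4's three neighbours is the poorly connected node 6, which drags its score down.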

Cliques

In a graph, a subset of nodes is called a clique if there is a connection between every pair of its nodes. Identifying these groups is a powerful technique for describing the cohesive force within groups. Studies in political science and sociology have revealed a strong link between group cohesion and susceptibility to radicalization. In social networks, it is often more practical to use weaker definitions of the clique:

  • n-clique: a subset of nodes in which every node can be reached from every other node in at most n steps.
  • k-core: a subset of nodes in which each vertex has at least k connections in the subgroup.

{1,2,5} forms a clique. (Source)
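Both the strict definition and the k-core relaxation are easy to check in code. The sketch below assumes the same six-node example graph as the figure (the `EDGES` list is my reconstruction of it); the k-core is found by repeatedly removing nodes that have fewer than k neighbours left.

```python
from itertools import combinations

# Assumed example graph (undirected); each tuple is one edge.
EDGES = [(1, 2), (1, 5), (2, 3), (2, 5), (3, 4), (4, 5), (4, 6)]

def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def is_clique(adj, nodes):
    """True if every pair of the given nodes is directly connected."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))

def k_core(adj, k):
    """Largest set of nodes in which everyone has at least k neighbours inside the set."""
    remaining = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(remaining):
            if len(adj[v] & remaining) < k:
                remaining.discard(v)
                changed = True
    return remaining

adj = build_adjacency(EDGES)
print(is_clique(adj, [1, 2, 5]))  # the triangle from the figure
print(is_clique(adj, [2, 3, 4]))  # 2 and 4 are not connected
print(sorted(k_core(adj, 2)))     # node 6 has a single link and drops out
```

The 2-core here contains a much larger group than the only three-person clique, which is why the weaker definitions tend to be more useful on noisy real-world data.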

Identifying these groups can help describe the structure of an organization. For example, in some cases suicide bombers form a clique; in other cases they are on the periphery, independent of each other. With cliques, ideological homogeneity and its effect on the network can be estimated.

Case studies

American law enforcement has already been experimenting with applications of social network analysis, and the results so far are convincing.

Disclaimer: this part is a summary of the webinar “Utilizing Social Network Analysis to Reduce Violent Crime”. Also, I took pictures from the slides. You can find those here.

Kansas City

Historically, Kansas City is one of the top 10 most violent cities in the United States. One of the most striking statistics is that only 13 of its 315 square miles account for 47% of all homicides. In 2012, after a new city government was established, significant changes were made: an organization called KC NoVA (Kansas City No Violence Alliance) was set up with the purpose of reducing homicides and aggravated assaults.

KCPD provides social services to reduce violence (Source)

In 2014, based on all available information, the police visualized connections between 832 individuals. In this process, they identified 66 violent groups. 47.5% of these groups were considered extremely violent, and 13% were found to be highly organized.

Using graph measures such as betweenness centrality, NoVA selected individuals from each group and called them in. They notified these people that violence would not be tolerated and offered several services, such as education and employment preparation.

Betweenness centrality used on a gang called Dime Block. Red indicates that the person has an arrest warrant (Source)

The results were overwhelming: homicides fell by 28%, to the lowest level in the last 30 years.

Kansas City monthly homicides (Source)

Chicago

It might be surprising, but the city of Al Capone is not among the most violent cities in the USA. Of course, there is room for improvement.

In August 2010, the Group Violence Reduction Strategy (VRS) was formed to solve and reduce cases of gang-related shootings. After collecting raw data from local PDs, they came to the following conclusions:

  • The old, nation-based gangs are outdated and falling apart.
  • Gangs are now smaller and more geographically concentrated.

VRS took two approaches. First, they used audits to identify the most active factions. One way to do that was to assign a node to every faction and connect two factions if there was any reported conflict between them.

Conflict network of Chicago to identify the most active networks. (Source)
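A rough sketch of that first step could look like this. The faction names and `reports` are invented, and ranking factions by their number of distinct rivals is my assumption for illustration; the webinar summary does not say which measure VRS actually used.

```python
from collections import defaultdict

# Hypothetical conflict reports between factions; repeats are possible.
reports = [
    ("F1", "F2"),
    ("F2", "F1"),
    ("F2", "F3"),
    ("F2", "F4"),
    ("F3", "F4"),
]

def conflict_network(reports):
    """Connect two factions if at least one conflict between them was reported."""
    rivals = defaultdict(set)
    for a, b in reports:
        if a != b:
            rivals[a].add(b)
            rivals[b].add(a)
    return rivals

def rank_factions(rivals):
    """Order factions by number of distinct rivals (assumed activity measure)."""
    return sorted(rivals, key=lambda faction: (-len(rivals[faction]), faction))

rivals = conflict_network(reports)
ranking = rank_factions(rivals)
print(ranking)
```

Because neighbours are stored in sets, repeated reports about the same pair of factions collapse into a single edge, matching the "any reported conflict" rule.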

Next, they investigated the social structure of the most active groups to identify the most important individuals. To achieve that, they created a co-offending network by connecting individuals if there was a recorded crime in which they partnered.

Co-offending network to identify the most influential persons in a group (Source)

The results speak for themselves in Chicago, too: a 23% reduction in overall shootings and a 32% reduction in victimization in the factions where SNA was used.

Results after 12 months of evaluation (Source)

TL;DR

Social Network Analysis (SNA) is a modern and highly efficient tool that lets detectives see criminal organizations from another perspective. Unfortunately, because the underlying data has a good chance of being insufficient or biased, decisions cannot be based solely on such measures.

Still, as authorities tend to use proactive strategies more extensively, SNA is a vital tool for more effective intelligence assessment. Results in the USA and Scandinavia show that, used competently, SNA can significantly improve the way authorities operate.

If you want to know more about the subject, check out these excellent sources I based this article on:


Published by HackerNoon on 2018/02/12