So you want to be a Data Ethicist?

Congratulations!

You will probably be one of the first in the world!

For my part, I call myself this because I research and actively consider the ethical implications of using data in society. And these implications only get more far-reaching as we collect more data.

I started my journey in this area by wanting to know how deep learning models made their ‘decisions’. I have yet to find a good answer to that question, but research into it is ongoing.

But really, understanding the model is only one small part of a much bigger issue. A deep learning model — or any predictive algorithm — exists within an ecosystem of humans, procedures, and politics.

And running through the whole process is data. There are many questions that need to be asked of data: Where has it come from? How representative is it? How is it manipulated to fit into a machine learning model? What is the model trying to optimise? Does the data retain its integrity throughout the process? Where does the output data go? Is the output interpreted in an appropriate way? And so on…
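To make one of these questions concrete, here is a minimal Python sketch of a first-pass check on “How representative is it?”. It compares each group’s share of a sample with its share of a reference population. The function name `representation_gaps`, the group labels, and all the numbers are hypothetical and made up for illustration; a real check would need a trustworthy reference population and a good deal more care than this.

```python
from collections import Counter


def representation_gaps(sample_groups, reference_shares):
    """Return (share in sample - share in reference) for each group.

    A positive gap means the group is over-represented in the sample,
    a negative gap means it is under-represented.
    """
    total = len(sample_groups)
    if total == 0:
        raise ValueError("sample_groups is empty")
    counts = Counter(sample_groups)
    gaps = {}
    # Cover groups that appear in either the sample or the reference.
    for group in set(reference_shares) | set(counts):
        sample_share = counts.get(group, 0) / total
        gaps[group] = sample_share - reference_shares.get(group, 0.0)
    return gaps


# Hypothetical, made-up data purely for illustration.
sample = ["urban"] * 80 + ["rural"] * 20
reference = {"urban": 0.6, "rural": 0.4}

gaps = representation_gaps(sample, reference)
for group, gap in sorted(gaps.items()):
    print(f"{group}: {gap:+.2f}")
```

In this invented example the ‘rural’ group makes up a smaller share of the sample than of the reference population. A gap like that does not settle any ethical question on its own, but it is the kind of signal that should prompt one.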

So, I would say a Data Ethicist looks at data in its local context — to meet an organisation’s objective — as well as in a social context. It could be summed up with the question: we can do this, but should we?

A Survey of ‘Data Ethicist’ on the Internet

Google Trends (7 June 2017): Green = automation; Blue = artificial intelligence; Yellow = machine learning; Red = deep learning

The interest in all things data and artificial intelligence is growing rapidly. We need Data Ethicists to manage the social implications. Don’t bother searching for ‘Data Ethicist’ within Trends though — it’s not on the radar!

A search on Google Scholar comes back with 4 results for “Data Ethicist”.

A standard Google web search comes back with 705 results. The vast majority of these are syndicated snippets of one article or are, in various ways, not very reliable. I’ll provide a timeline of the more interesting items below.

2017

boingboing, “Ethics and AI: all models are wrong, some are useful, and some of those are good”

An article about a presentation by Abe Gong.

“Gong reiterates the important point that the problem is rarely with algorithms, but rather with training data — machine learning just lets us hide our sampling bias behind a black box, providing a veneer of empiricism that overlays racism and discrimination.”

2016

National Security Telecommunications Advisory Committee, “NSTAC Report to the President on Big Data Analytics” (PDF)

The report makes a number of recommendations calling for the involvement of a data ethicist in big data projects.

“Data Ethicist: One whose judgment on the ethics of data usage has come to be trusted by a specific community, and is expressed in some way that makes it possible for others to mimic or approximate that judgment.”

2015

Accenture, “The Case for Data Ethics”

“Forty-two percent of respondents to a survey taken at the Joint Statistical Meeting’s annual gathering in August 2014 agreed that ethical research standards should be in place for data scientists, while 43 percent said that ethics already plays “a big part” in their research.”

“By 2017, they expect that 25 percent of large enterprises will have a digital code of conduct, and by 2018, that fully half of business ethics violations will be caused by the improper use of big data analytics.”

“If the entity in question is a platform, the effects of lax ethical data practices are compounded and can ripple throughout systems — even more so if decision making is being performed by automated algorithms.”

“To protect public safety, these sense-and-respond systems must be governed by ethical algorithms.”

The Alliance for Media, Arts + Culture, “Data, Ethics, Community”

“I’m remembering some of the ideas that Anna Lauren Hoffman, the data ethicist from UC Berkeley, shared with us on this topic:

There are ethics to citizen science, and tension around the validity of data collected on the ground by individuals. This is a form of gatekeeping.
Think about privacy and social data — when working with communities, information about one often provides information about a whole group of people.
Think about the politics of platforms — how technology is designed and who the platforms include and exclude.
With data, we face some of the same issues as when we are thinking of information and design — you come in with a set of categories/classifications and are ready to put data in those boxes and they don’t fit into those boxes.”

2013

Information Week, “Why You’ll Need A Big Data Ethics Expert”

This is the article that a few others online point to when they refer to ‘getting a data ethicist’.

“Along with big data technology developers, your company should be thinking about adding a ‘big data ethicist.’”

“The looming issue in big data isn’t technology but the decisions associated with how, when and if results should be provided.”