Australia, we urgently need to talk about data ethics

An earlier version of this article was published on Ellen’s blog.

Centrelink’s debt recovery woes perfectly illustrate the human side of data modelling.

The Department of Human Services issued 169,000 debt notices after automating its processes for matching welfare recipients’ reported income with their tax records. Around one in five of the people who received a notice are estimated not to owe any money. Stories abounded of people receiving erroneous debt notices for up to thousands of dollars, causing real anguish.

Coincidentally, as this unfolded, one of the books on my reading pile was Weapons of Math Destruction by Cathy O’Neil. She is a mathematician turned quantitative analyst turned data scientist who writes about the bad data models increasingly being used to make decisions that affect our lives.

Reading Weapons of Math Destruction as the Centrelink stories emerged left me thinking about how we identify ‘bad’ data models, what ‘bad’ means and how we can mitigate the effects of bad data on people. How could taking an ethics-based approach to data help reduce harm? What ethical frameworks exist for government departments in Australia undertaking data projects like this?

Bad data and ‘weapons of math destruction’

A data model can be ‘bad’ in different ways. It might be overly simplistic. It might be based on limited, inaccurate or old information. Its design might incorporate human bias, reinforcing existing stereotypes and skewing outcomes. Even where a data model doesn’t start from bad premises, issues can arise about how it is designed, its capacity for error and bias and how badly people could be impacted by error or bias.

A bad data model spirals into a weapon of math destruction when it’s used en masse, is difficult to question and damages people’s lives.

Weapons of math destruction tend to hurt vulnerable people most. They might build on existing biases – for example, assuming you’re more likely to reoffend because you’re black or you’re more likely to have car accidents if your credit rating is bad. Errors in the model might have starker consequences for people without a social safety net. Some people may find it harder than others to question or challenge the assumptions a model makes about them.

Unfortunately, although O’Neil tells us how bad data modelling can lead to weapons of math destruction, she doesn’t tell us much about how we can manage these weapons once they’ve been created.

Better data decisions

We need more ways to help data scientists and policymakers navigate the complexities of projects involving personal data and their impact on people’s lives. Regulation has a role to play here. Data protection laws are being reviewed and updated around the world.

For example, in Australia the draft Productivity Commission report on data sharing and use recommends giving people new ‘consumer rights’ over their personal data. Bodies such as the Office of the Information Commissioner help organisations understand whether they’re treating personal data in a principled manner that promotes best practice.

Guidelines are also being produced to help organisations be more transparent and accountable in how they use data to make decisions. For instance, the Open Data Institute in the UK has developed openness principles designed to build trust in how data is stored and used. Algorithmic transparency is being contemplated as part of the EU Free Flow of Data Initiative and has become a focus of academic study in the US.

However, we cannot rely on regulation alone. Legal, transparent data models can still be ‘bad’ according to O’Neil’s standards. Widely known errors in a model could still cause real harm to people if left unaddressed. An organisation’s normal processes might not be accessible or suitable for certain people – the elderly, ill and those with limited literacy – leaving them at risk. It could be a data model within a sensitive policy area, where a higher duty of care exists to ensure data models do not reflect bias. For instance, proposals to replace passports with facial recognition and fingerprint scanning would need to manage the potential for racial profiling and other issues.

Ethics can help bridge the gap between compliance and our evolving expectations of what is fair and reasonable data usage. O’Neil describes data models as “opinions put down in maths”. Taking an ethics-based approach to data-driven decision-making helps us confront those opinions head on.

Building an ethical framework

Ethics frameworks can help us put a data model in context and assess its relative strengths and weaknesses. Ethics can bring to the forefront how people might be affected by the design choices made in the course of building a data model.

An ethics-based approach to data-driven decisions would start by asking questions such as:

  • Are we compliant with the relevant laws and regulation?
  • Do people understand how a decision is being made?
  • Do they have some control over how their data is used?
  • Can they appeal a decision?

However, it would also encourage data scientists to go beyond these compliance-oriented questions to consider issues such as:

  • Which people will be affected by the data model?
  • Are the appeal mechanisms useful and accessible to the people who will need them most?
  • Have we taken all possible steps to ensure errors, inaccuracies and biases in our model have been removed?
  • What impact could potential errors or inaccuracies have? What is an acceptable margin of error?
  • Have we clearly defined how this model will be used and outlined its limitations? What kinds of topics would it be inappropriate to apply this modelling to?

There’s no debate right now to help us understand the parameters of reasonable and acceptable data model design. What’s considered ‘ethical’ changes as we do, as technologies evolve and new opportunities and consequences emerge.

Bringing data ethics into data science reminds us we’re human. Our data models reflect design choices we make and affect people’s lives. Although ethics can be messy and hard to pin down, we need a debate around data ethics.


Australia Day and #changethedate - a tale of two truths

The recent debate about whether or not Australia Day should be celebrated on January 26th has been turned into a contest between two rival accounts of history.

On one hand, the ‘white arm band’ promotes Captain Arthur Phillip’s arrival in Port Jackson as the beginning of a generally positive story in which the European Enlightenment is transplanted to a new continent and gives rise to a peaceful, prosperous, modern nation that should be celebrated as the envy of the world.

On the other hand, the ‘black arm band’ describes the British arrival as an invasion that forcefully and unjustly dispossesses the original owners of their land and resources, ravages the world’s oldest continuous culture, and pushes to the margins those who had been proud custodians of the continent for sixty millennia.

This contest has become rich pickings for mainstream and social media where, in the name of balance, each side has been pitched against the other in a fight that assumes a binary choice between two apparently incommensurate truths.

However, what if this is not a fair representation of what is really at stake here? What if there is truth on both sides of the argument?

The truth – that is, the whole truth – is that the First Fleet brought many things. Some were good and some were not. Much that is genuinely admirable about Australia can be traced back to those British antecedents. The ‘rule of law’, the methods of science, the principle of respect for the intrinsic dignity of persons… are just a few examples of a heritage that has been both noble in its inspiration and transformative in its application in Australia.

Of course, there are dark stains in the nation’s history – most notably in relation to the treatment of Indigenous Australians. Not only were the reasonable hopes and aspirations of Indigenous people betrayed – so were the ideals of the British who had been specifically instructed to respect the interests of the Aboriginal peoples of New Holland (as the British called their foothold on the continent).

The truth – that is, the whole truth – is that the arrival of the Europeans was a disaster for those already living here for generations beyond human memory. This was the same kind of disaster that befell the Britons with the arrival of the Romans, the same kind of disaster visited on the Anglo-Saxons when invaded by the Vikings and their Norman kin. Land was taken without regard for prior claims. Language was suppressed, if not destroyed. Local religions trashed. All taken – by conquest.

No reasonable person can believe the arrival of Europeans was not a disaster for Indigenous people. They fought. They lost. But they were not defeated. They survive. Some flourish. Yet with only two hundred or so years having passed since European arrival, the wounds remain.

The truth – that is, the whole truth – is that both accounts are true. And so is our current incapacity to realise this. Instead we are driven by politicians and commentators and, perhaps, the temper of the times, to see the world as one of polar opposites. It is a world of winners and losers, a world where all virtue is supposed to lie on just one side of a question, a world in which we are cut by the brittle, crystalline edges of ideological certainty.

One of the great skills cultivated by ethical people is the capacity for curiosity, moral imagination and reasonable doubt. Taken together, these attributes allow us to see the larger picture – the proverbial forest that is obscured by the trees. This is not an invitation to engage in some kind of relativism – in which ‘truth’ is reduced to mere opinion. Instead, it is to recognise that the truth – the whole truth – frequently has many sides and that each of them must be seen if the truth is to be known.

But first you have to look. Then you have to learn to see what might otherwise be obscured by old habits, prejudice, passion, anger… whatever your original position might have been.

So, what are we to make of January 26th? The answer depends on what we think is to be done on this day. Is it a time of reflection and self-examination? If so, then January 26th is a potent anniversary. If, on the other hand, it is meant to be a celebration of and for all Australians, then why choose a date which represents loss and suffering for so many of our fellow citizens?