Can algorithms ever be fair?

17 October 2018

Algorithms and data are increasingly influencing the justice system. Professor Sofia Olhede and Dr Patrick Wolfe explain the basics all lawyers should understand.

What is an algorithm?

An algorithm is a list of rules that are followed automatically, step by step, to solve a problem.
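
To make this concrete, here is a minimal sketch in Python of an algorithm in this sense: a fixed list of rules followed step by step. The inputs, rules and thresholds are invented purely for illustration.

```python
# A toy illustration: an "algorithm" is just a fixed list of rules,
# followed step by step. The rules and thresholds below are invented
# for illustration only.

def risk_band(prior_offences: int, age: int) -> str:
    # Step 1: start from a base score.
    score = 0
    # Step 2: add one point per prior offence.
    score += prior_offences
    # Step 3: add a point for offenders under 25 (an invented rule).
    if age < 25:
        score += 1
    # Step 4: map the score to a band.
    return "high" if score >= 3 else "low"

print(risk_band(prior_offences=2, age=22))  # -> "high"
```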

When considering their consequences, algorithms cannot be separated from the data used to operate them. Indeed, dirty data can trip up the cleanest algorithm, leading to questions of fairness and bias in automated decision-making that we shall explore.

Machine learning

Machine learning is a category of algorithm that allows software applications to become more accurate in predicting outcomes without being explicitly programmed.
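
As a rough sketch of the difference, the fragment below fits a simple model to a handful of historical examples rather than encoding the decision rules by hand. The data are made up for illustration, and scikit-learn is assumed to be available.

```python
# A minimal sketch of machine learning: instead of hand-writing rules,
# we fit a model to past examples and let it infer the rules. The data
# here are invented.
from sklearn.linear_model import LogisticRegression

# Each row: [number of prior offences, age]; label: 1 = re-offended.
X = [[0, 40], [1, 35], [4, 22], [3, 25], [0, 30], [5, 19]]
y = [0, 0, 1, 1, 0, 1]

model = LogisticRegression()
model.fit(X, y)  # "learning" = estimating the rules from data

print(model.predict([[2, 28]]))        # predicted label for a new case
print(model.predict_proba([[2, 28]]))  # predicted probabilities
```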

As the authors of one 2017 paper summed up (Kusner et al, listed under Research below):

"Machine learning can impact people with legal or ethical consequences when it is used to automate decisions in areas such as insurance, lending, hiring, and predictive policing. In many of these scenarios, previous decisions have been made that are unfairly biased against certain subpopulations, for example those of a particular race, gender, or sexual orientation. Since this past data may be biased, machine learning predictors must account for this to avoid perpetuating or creating discriminatory practices."

Algorithms in decision making

Recent years have seen considerable debate about algorithm-assisted decisions and fairness, especially in legal and policy contexts – see, for example, the UK parliament's 2017 Algorithms in decision-making inquiry.

The watershed moment was perhaps the deployment of a commercial system in the US called Correctional Offender Management Profiling for Alternative Sanctions (COMPAS), designed to provide risk assessments to assist the criminal justice system. Public attention swiftly followed a 2016 ProPublica story about COMPAS titled "Machine Bias: There's software used across the country to predict future criminals. And it's biased against blacks". This led to a wider, ongoing debate about bias and fairness in the context of automated decision-making.

Bias and statistics

Bias has a recognised technical meaning in the field of statistics. We start with a population about which we wish to draw conclusions (such as the population of all criminal offenders in a given jurisdiction, and their average likelihood of re-offending), a sample of data collected from that population, and an algorithm that calculates a conclusion from these data as a set of self-executing steps.

If the algorithm and data sampling mechanism are systematically "off", so that the calculation is not correct "on average" with respect to the entire population, then we say the resulting conclusions are biased.

Typical bias examples:

  • selection bias - drawing conclusions, say, on the basis only of offenders who were apprehended, rather than all offenders (simulated in the sketch below)
  • reporting bias - where, by polling offenders directly, we might expect them to under-report their own likelihood to re-offend
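
The following simulation sketches the first of these; every number in it is invented. Because the apprehended offenders are assumed, for illustration, to re-offend more often than the population as a whole, an estimate based only on them is systematically "off".

```python
# A simulation sketch of selection bias (all numbers invented).
import random

random.seed(0)
population = []
for _ in range(100_000):
    apprehended = random.random() < 0.3          # only 30% are apprehended
    # Invented assumption: apprehended offenders re-offend more often.
    rate = 0.6 if apprehended else 0.2
    reoffended = random.random() < rate
    population.append((apprehended, reoffended))

true_rate = sum(r for _, r in population) / len(population)
sampled = [r for a, r in population if a]        # selection-biased sample
sample_rate = sum(sampled) / len(sampled)

print(f"true population rate:   {true_rate:.3f}")    # ~0.32
print(f"biased sample estimate: {sample_rate:.3f}")  # ~0.60
```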

Bias can be insidious and hard to recognise – even more so as data become dirtier and algorithms more complex. Some vivid examples are provided by confusing reported pothole prevalence with smartphone prevalence, or CEOs with white men. A moment's reflection shows that bias can arise both because we are not getting a complete picture of the population about which we wish to draw conclusions, and because humans have drawn biased conclusions in the past, which algorithms pick up and simply codify into practice.

What is fair?

Even setting bias aside, how might we characterise an algorithmic conclusion as "fair"? To begin to answer this question, we must reconcile the dictionary definition of fairness – equal treatment of those in the population about which conclusions are drawn, devoid of favouritism or discrimination – with a mathematical definition applicable to algorithms.

For an algorithm (a set of self-executing steps), assessing the fairness of treatment would come back to the mathematical principle underlying the direction of the self-executing steps. Usually this principle can be cleanly stated as an algorithmic design criterion, such as: "Assign a likelihood-of-re-offence score that is maximally consistent with known re-offences in a certain set of historical data." This principle is the same one by which websites recommend new products for you to try, but in the context of the justice system we can see quite clearly how issues of algorithmic bias and fairness might arise!
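
As a rough sketch of what such a design criterion looks like in code, the fragment below chooses, from a family of one-parameter scoring rules, the one that disagrees least with a tiny set of historical outcomes. Both the rule family and the data are invented for illustration.

```python
# A sketch of a design criterion in code: choose the score that is
# "maximally consistent" with historical re-offences, i.e. minimise the
# disagreement between predictions and known outcomes.

# Historical data: feature (prior offences) and known re-offence (0/1).
history = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 1)]

def disagreement(threshold: int) -> int:
    # Scoring rule: predict re-offence when priors >= threshold.
    return sum(1 for priors, outcome in history
               if (priors >= threshold) != bool(outcome))

# The "algorithmic design criterion": pick the threshold with the
# fewest disagreements on the historical data.
best = min(range(6), key=disagreement)
print(best, disagreement(best))  # threshold 2 fits this history exactly
```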

Algorithmic design criteria

In the discussion surrounding COMPAS and criminal risk assessments, squaring the dictionary and mathematical definitions of "fair" quickly led to new and newly relevant questions for algorithm researchers to consider.

Even the simplest notions of fairness must, once formulated mathematically, be understood at the level of individuals affected by algorithmic conclusions and decisions, and at the level of groups of individuals with common characteristics. 

Much recent research has focused on how algorithmic design criteria might need adjustment to avoid taking account of legally protected characteristics that we might find in data, such as gender or race. Because gender and race are often associated with other measured variables that are not themselves legally protected, doing so rapidly becomes very complex, as the sketch below illustrates.
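
A small simulation sketch of why this is hard: even if the protected attribute is removed from the data, a correlated proxy (think of a postcode area) still lets it be recovered most of the time. All variables and correlation strengths here are invented assumptions.

```python
# A sketch of why simply dropping a protected characteristic is not
# enough: a correlated, unprotected proxy can carry the same
# information. All variables and correlations here are invented.
import random

random.seed(1)
rows = []
for _ in range(50_000):
    protected = random.random() < 0.5  # e.g. a protected group flag
    # Invented proxy: strongly correlated with the protected attribute,
    # but not itself legally protected.
    proxy = protected if random.random() < 0.9 else not protected
    rows.append((protected, proxy))

# Even with the protected attribute removed from the data, the proxy
# lets us recover it about 90% of the time.
recovered = sum(1 for p, q in rows if p == q) / len(rows)
print(f"protected attribute recoverable from proxy: {recovered:.0%}")
```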

It might well be "fair" and appropriate in some abstract sense to adjust algorithmic conclusions systematically for a given group – such as offenders in a certain age bracket – whereas adjusting conclusions differently for individuals within the group might be seen as unfair. Such group-level adjustments are also possible within the setting of algorithmic decision-making, as the sketch below shows.
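
Here is a minimal sketch of such a group-level adjustment, with invented scores and offsets: every individual in the group receives the same adjustment, so no one within the group is treated differently from anyone else in it.

```python
# A sketch of a group-level adjustment: every individual in a given
# group (here, an invented age bracket) has their score shifted by the
# same amount. Scores and offsets are invented for illustration.

def adjusted_score(raw_score: float, age: int) -> float:
    # Uniform, group-level adjustment for offenders under 25.
    group_offset = -0.1 if age < 25 else 0.0
    return raw_score + group_offset

for age, score in [(22, 0.7), (24, 0.5), (40, 0.5)]:
    print(f"age {age}: {adjusted_score(score, age):.2f}")
# Both under-25 individuals get the same -0.1 adjustment; the
# 40-year-old's score is unchanged.
```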

Biased samples of data can obviously lead to unfairness because we inherit the bias in the data. But what is "fair" in a more abstract setting? This is more contentious, and arises in settings where groups of individuals might claim unequal treatment under the law. For example, city-dwellers might be less likely (on the basis, say, of police presence per capita) to be apprehended for a given class of criminal act than country-dwellers.

Should an algorithm then account for this, and treat apprehended offenders differently if they live in different-sized locales? More data gives us more options, and potentially the ability to give a more nuanced response in any particular decision about an individual – just as we would expect experienced judges to do in criminal law.

The questions remain as to when and whether more nuanced responses ought to be applied, and unlike relying on the experience of a judiciary, we have yet to formalise algorithmic decision-making to the point where it can take into account reason, common sense and precedent, as well as a modicum of prudent personal wisdom, assessment and judgement.

Views expressed in our blogs are those of the authors and do not necessarily reflect those of the Law Society.

Research

Chouldechova, A. (2017) Fair Prediction with Disparate Impact: A study of bias in recidivism prediction instruments. Big Data 5(2), 153-163.

Kleinberg, J., Mullainathan, S. and Raghavan, M. (2017) Inherent Trade-Offs in the Fair Determination of Risk Scores. Proceedings of Innovations in Theoretical Computer Science (ITCS).

Kusner, M., Loftus, J., Russell, C. and Silva, R. (2017) Counterfactual Fairness. Advances in Neural Information Processing Systems (NIPS) 30.

Dwork, C., Hardt, M., Pitassi, T., Reingold, O. and Zemel, R.S. (2012) Fairness Through Awareness. In Goldwasser, S. (ed) Innovations in Theoretical Computer Science, pp. 214-226. New York, NY: Association for Computing Machinery.

Tags: knowledge management | technology | artificial intelligence

About the author

Professor Sofia Olhede is a co-chair of the Law Society's Technology and the Law Policy Commission. She is a professor of Statistics, an honorary professor of Computer Science and a senior research associate of Mathematics at University College London. She holds a European Research Council Consolidator fellowship. Sofia has contributed to the study of stochastic processes, time series, random fields and networks. She has been a member of the ICMS Programme Committee since September 2008, and is a member of the London Mathematical Society Research Meetings Committee, a member of the London Mathematical Society Research Policy Committee and an associate editor for Transactions of Mathematics and its Applications.

About the author

Dr Patrick Wolfe is a data scientist and the Frederick L. Hovde Dean of Science at Purdue University. He is a trustee and non-executive director of the Alan Turing Institute. He has provided expert advice on applications of data science to policy, societal and commercial challenges, including to the US and UK governments and to a range of public and private bodies.
