RENFORCE Blog

Europese kernwaarden

Reflection on the GDS webinar by Sandra Wachter: ‘The (im)possibility of algorithmic fairness’

Machiko Kanetake, Lucky Belder and Karin van Es

© iStockphoto.com/PayPau

What regulatory frameworks does the EU have to detect and rectify biased algorithms? Unfortunately, some of the celebrated legal frameworks in the EU on data protection and non-discrimination do not seem to be fit for purpose in the age of automated decision-making, as Sandra Wachter elucidated in her Utrecht University webinar on 26 January 2021 hosted by the Special Interest Group ‘Principles by Design: Towards Good Data Practice’.

Data-based ‘inferences’ and assumptions that matter

In her eloquent and engaging webinar, Sandra Wachter focused on the contextuality of non-discrimination law and how it affects the identification of possible discriminatory practices in the age of automation. As Wachter pointed out during the lecture, our daily life is already profoundly affected by algorithmic decision-making used by public and private entities. Examples are abundant and diverse. Algorithmic profiling can influence decisions on job and loan applications, university admission, or criminal sentences, just to name a few.

While the collection of data that serves as the basis for algorithmic decision-making can itself be problematic, we must also recognise that certain ‘inferences’ and assumptions are drawn about individuals or groups of individuals. Such data-based inferences can lead to what Wachter terms ‘discrimination by association’, whereby inaccurate or unwanted interests are attributed to individuals even beyond the circle of legally protected vulnerable groups. Unless such inferences are subject to control by data subjects, the legal framework on data protection would fail to identify, much less prevent, the problematic use of personal data.

Contextuality in privacy and non-discrimination law

In this sense, one of the critical questions pertains to whether, and to what extent, data-based inferences qualify as ‘personal data’ under the EU’s data protection law, notably the General Data Protection Regulation (GDPR). Sandra Wachter and Brent Mittelstadt define inferences as ‘information relating to an identified or identifiable natural person created through deduction or reasoning rather than mere observation or collection from the data subject’ (‘A Right to Reasonable Inferences’, Columbia Business Law Review (2019), vol. 2, p. 515).

Not surprisingly, some inferences are privacy-invasive or harmful to one’s reputation. Despite the impact of such inferences, the GDPR and related jurisprudence do not (yet) give us robust recourse against unreasonable inferences, as pointed out by Wachter during the webinar. Furthermore, even if certain inferences can be considered personal data, the GDPR does not serve as a robust framework to assess how one’s data is being evaluated and what kind of assumptions or predictions are made with regard to one’s behaviour.

The impact of algorithmic inferences, however, goes deeper than the immediate consequences of automated decisions for individuals. As Wachter highlighted during the webinar, it is our ability to detect possible cases of discrimination that has been profoundly affected. While the EU has well-developed non-discrimination law, it does not provide a useful tool to identify possible discriminatory practices in automated systems. This is precisely due to the ‘contextuality’ of law and judicial scrutiny, as Wachter pointed out.

The assessment of indirect discrimination under EU non-discrimination law is, perhaps necessarily, contextual, preserving space for judges’ case-by-case and partly intuitive assessment. While such contextuality has merit, ‘the law does not provide a static or homogenous framework suited to testing for discrimination in AI systems’, as argued by Wachter, Brent Mittelstadt, and Chris Russell (‘Why Fairness Cannot be Automated: Bridging the Gap between EU Non-Discrimination Law and AI’ (2020), p. 7).

Statistics-based bias tests

Photo: Tingey (Unsplash)

To overcome undetected injustice, the scale of which cannot readily be known, Wachter discussed various proposals, including the use of statistics-based bias tests. In her co-authored article, Wachter elaborates upon the ‘conditional demographic disparity’ test as a ‘baseline for evidence to ensure a consistent procedure for assessment (but not interpretation) across cases involving potential discrimination caused by automated systems’ (‘Why Fairness Cannot be Automated’, p. 6, emphasis added). Demographic parity asserts that ‘the proportion of people with a protected attribute should be the same in the advantaged and disadvantaged group’ (ibid., p. 50, emphasis added).
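
To give a concrete sense of what such a test involves, the sketch below illustrates how demographic disparity, and its conditional variant, could be computed in the spirit of the test described by Wachter and her co-authors. It is a minimal illustration only: the column names (‘protected’, ‘outcome’, ‘stratum’) and the toy figures are our own assumptions and are not taken from the article or the webinar.

```python
# Minimal sketch of a (conditional) demographic disparity calculation.
# All data and column names are hypothetical and purely illustrative.
import pandas as pd

def demographic_disparity(df, protected_col, outcome_col):
    """Share of the protected group among the disadvantaged (outcome == 0)
    minus its share among the advantaged (outcome == 1)."""
    disadvantaged = df[df[outcome_col] == 0]
    advantaged = df[df[outcome_col] == 1]
    return disadvantaged[protected_col].mean() - advantaged[protected_col].mean()

def conditional_demographic_disparity(df, protected_col, outcome_col, stratum_col):
    """Average the per-stratum disparity, weighted by stratum size, so that a
    legitimate explanatory factor (the conditioning variable) is controlled for."""
    weights = df[stratum_col].value_counts(normalize=True)
    per_stratum = df.groupby(stratum_col).apply(
        lambda g: demographic_disparity(g, protected_col, outcome_col)
    )
    return (per_stratum * weights).sum()

# Toy example: 'protected' marks membership of a protected group, 'outcome' is 1
# for an advantaged decision (e.g. a loan granted) and 0 otherwise, and 'stratum'
# is a legitimate conditioning variable (e.g. an income band).
toy = pd.DataFrame({
    "protected": [1, 1, 1, 0, 0, 0, 1, 0],
    "outcome":   [0, 0, 1, 1, 1, 0, 0, 1],
    "stratum":   ["low", "low", "low", "low", "high", "high", "high", "high"],
})
print(demographic_disparity(toy, "protected", "outcome"))
print(conditional_demographic_disparity(toy, "protected", "outcome", "stratum"))
```

A value of zero would indicate that the protected group is represented in the same proportion in the advantaged and the disadvantaged group; the further the value departs from zero, the stronger the prima facie indication of disparity, which would then call for contextual, judicial interpretation.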

In presenting the demographic disparity test, Wachter referred to paragraph 59 of the Court of Justice’s judgment in Seymour-Smith (1999), in which the Court held that, where statistics are relied upon in discrimination cases, the disadvantaged group should be compared with the advantaged group. The Court described such a comparison as ‘the best approach’:

‘the best approach to the comparison of statistics is to consider, on the one hand, the respective proportions of men in the workforce able to satisfy the requirement of two years’ employment under the disputed rule and of those unable to do so, and, on the other, to compare those proportions as regards women in the workforce’ (Seymour-Smith, Case C-167/97, para. 59).
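
Purely by way of illustration, and with invented figures that are not taken from the judgment, the comparison the Court describes amounts to contrasting, within each sex, the proportion of workers able to satisfy the two-year requirement:

```python
# Illustrative reading of the Seymour-Smith comparison; all figures are invented.
men_qualifying, men_total = 77, 100          # hypothetical workforce figures
women_qualifying, women_total = 68, 100      # hypothetical workforce figures

men_rate = men_qualifying / men_total        # proportion of men meeting the rule
women_rate = women_qualifying / women_total  # proportion of women meeting the rule

# The two proportions are then compared to ask whether a considerably smaller
# percentage of women than men is able to satisfy the disputed requirement.
print(f"men: {men_rate:.0%}, women: {women_rate:.0%}")
```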

In line with the Court of Justice’s approach in Seymour-Smith and related jurisprudence, Wachter and her co-authors put forward the conditional demographic disparity test as a ‘minimal standard for statistical evidence in non-discrimination cases addressing automated systems’ (‘Why Fairness Cannot be Automated’, p. 54). What matters is that legal experts should work with and draw inspiration from statistical approaches in detecting disparity and cases of discrimination.

Algorithmic fairness and the ‘Toeslagenaffaire’

Sandra Wachter’s webinar took place several weeks after the report on the ‘Toeslagenaffaire’ (the Dutch childcare benefit scandal), in which the Dutch Tax and Customs Administration wrongly accused over 26,000 parents of making fraudulent claims for childcare benefits. Dutch citizens with dual nationality were found to have been monitored more closely. In the report, the investigators found the procedures of the Tax and Customs Administration to be ‘discriminatory’.

The scandal demonstrated, in part, the highly problematic consequences of a lack of appropriate control over algorithmic inferences. While the incident has many different dimensions, the relevant institutions in our democratic society were seemingly blind to the potential harm that reliance on automated systems may inflict on individual citizens. In order to prevent this kind of injustice, it is crucial to employ a consistent assessment procedure that can uncover automated discrimination, and such a procedure should be part of the design of the program architecture.

At the same time, such an assessment tool should enable, but not replace, judicial review of decisions affecting individuals. Some of the tools suggested by Wachter, including the conditional demographic disparity test, can make a critical difference in remedying the problematic aspects of automated decision-making in both public and private institutions. They allow data to be assessed at scale without losing sight of the contextual aspects of fairness.

This blog has been written by the SIG coordinators and, therefore, any errors are theirs alone and not those of the speaker of the online seminar.