
Fairness in Data Science: Interdisciplinary Concepts and Challenges
Explore the complexities of fairness in data science through interdisciplinary perspectives and challenges. Delve into discrimination-aware data mining, GDPR principles, legal protection against discrimination, and historical examples of societal biases.
Presentation Transcript
What is fairness anyway? Interdisciplinary concepts and data science. Bettina Berendt, KU Leuven. Publications and materials on privacy, privacy education, discrimination, and ethics at https://people.cs.kuleuven.be/~bettina.berendt/
Terminological note: As far as I can see, "fairness" is overloaded in the areas I'm involved in, e.g. "fair and transparent processing" as a core data-protection principle in the GDPR. I therefore prefer to use "discrimination", in the sense of the fundamental right (not to be discriminated against).
Plan: Background (data mining and discrimination); Approach (data mining against discrimination); 5 challenges (limitations of the algorithm-centric / informatics-only approach).
Background: Data mining and discrimination
Data mining is discrimination! [Decision-tree figure: a credit-scoring tree with splits such as "Has no checking account?" and "Savings < 10 000?" leading to loan decisions.]
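To make the slide's point concrete, here is a minimal sketch (not from the talk) of how a learned classifier literally discriminates between cases: a decision tree trained on a tiny, purely hypothetical loan dataset whose attribute names echo the slide's example. The data, column names, and thresholds are all illustrative assumptions, not the real German Credit data.

```python
# Minimal sketch (illustrative only): a decision tree "discriminates" in the
# literal sense of distinguishing between cases. Attribute names echo the
# slide's example; the data is made up.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

df = pd.DataFrame({
    "has_checking_account": [0, 0, 1, 1, 0, 1, 1, 0],   # hypothetical applicants
    "savings":              [500, 12000, 8000, 15000, 300, 20000, 9000, 700],
    "loan_granted":         [0, 1, 1, 1, 0, 1, 1, 0],
})

X = df[["has_checking_account", "savings"]]
y = df["loan_granted"]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Print the learned splits: each test partitions ("discriminates between")
# applicants, e.g. on a savings threshold.
print(export_text(tree, feature_names=list(X.columns)))
```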
You may not ... vs. you may ... In many areas, including labour, loans, and insurance. The protected-by-law grounds differ by area, but usually include gender, disability, age, sexual orientation, and cultural, religious and linguistic beliefs/affiliation.
You may no longer ... European Court of Justice (2011), Case C-236/09, Association Belge des Consommateurs Test-Achats ASBL and Others v Conseil des ministres: the use of sex as an actuarial factor should not result in differences in individuals' premiums and benefits. General Data Protection Regulation (EU law as of 2018): restrictions on fully automated decision-making and profiling, Article 22 (~ Article 15 of the EU Data Protection Directive of 1995); various recitals, among others Recital 71: the data controller should take measures to [prevent] "discriminatory effects on natural persons on the basis of racial or ethnic origin, political opinion, religion or beliefs, trade union membership, genetic or health status or sexual orientation". Historical examples: only { rich | white | male } people get to vote.
Approach: Discrimination-aware data mining (Pedreschi, Ruggieri, & Turini, 2008; & many others since)
Beyond classical DADM (Berendt & Preibusch, 2014 ff.). Decisions: algorithms in a decision context (human, organisational, wider systems). Cf. the EU restrictions on fully automated decisions; in general, due process always involves human judgment. Conceptual challenges.
Beyond classical DADM (Berendt & Preibusch, 2014). Decisions: algorithm and decision context. Conceptual challenges.
Challenge 1: About vicious cycles (and virtuous ones)
Why not just delete the problematic attributes? If the focus is detection: this prevents detection. If the focus is prevention: it may reproduce indirect discrimination ... and this indirect discrimination will also not be detected!
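A small sketch (hypothetical data and column names, not from the talk) of why simply deleting the protected attribute falls short: a proxy attribute such as "district" can carry the same information, so a model trained without the protected attribute still produces group-dependent decisions.

```python
# Sketch (hypothetical data): dropping the protected attribute does not
# remove indirect discrimination when a proxy -- here "district" -- still
# encodes group membership.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

df = pd.DataFrame({
    "group":    ["A"] * 4 + ["B"] * 4,          # protected attribute
    "district": ["red"] * 4 + ["green"] * 4,    # proxy, perfectly correlated with group
    "savings":  [400, 9000, 700, 12000, 500, 8000, 15000, 900],
    "loan":     [0, 0, 1, 0, 1, 1, 1, 1],
})

# Train *without* the protected attribute
X = pd.get_dummies(df[["district", "savings"]])
clf = DecisionTreeClassifier(random_state=0).fit(X, df["loan"])
df["decision"] = clf.predict(X)

# Positive-decision rates per group still differ (a statistical parity gap),
# because the model can read the group off the "district" proxy.
rates = df.groupby("group")["decision"].mean()
print(rates)
print("parity gap:", rates["B"] - rates["A"])
```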
From indirect to explainable to ... What if all people in the red districts ("indirect discrimination") have no collateral, and having no collateral is an accepted reason? "Explainable discrimination" (Kamiran, Zliobaite, & Calders, 2013), i.e. is it OK not to give them loans? What if not getting loans leads to people not having collateral? One reason for exploratory DADM based in interactive systems!
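The following toy computation (hypothetical numbers, only loosely in the spirit of Kamiran, Zliobaite, & Calders, 2013) illustrates the idea: compare acceptance-rate gaps overall and within strata of the accepted explanatory attribute ("collateral"); the part of the gap that disappears once collateral is held fixed is the "explainable" part.

```python
# Sketch (hypothetical data) of the intuition behind "explainable
# discrimination": condition the group comparison on an accepted
# explanatory attribute and see how much of the gap remains.
import pandas as pd

df = pd.DataFrame({
    "group":      ["A"] * 6 + ["B"] * 6,
    "collateral": [0, 0, 0, 0, 1, 1,   0, 0, 1, 1, 1, 1],
    "loan":       [0, 0, 0, 1, 1, 1,   0, 1, 1, 1, 1, 1],
})

# Overall acceptance-rate gap between groups
overall = df.groupby("group")["loan"].mean()
print("overall gap:", overall["B"] - overall["A"])

# Gap that remains once collateral is held fixed: the unexplained part
within = df.groupby(["collateral", "group"])["loan"].mean().unstack("group")
print("gap within each collateral stratum:")
print(within["B"] - within["A"])
```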
Beyond classical DADM (Berendt & Preibusch, 2014; Berendt, Büchler, & Rockwell, 2015). Decisions: algorithm and decision context. Conceptual challenges.
Challenge 2: Discrimination =?= not distinguishing by a given attribute
A recent example: Taddeucci and McCall vs. Italy European Court of Human Rights judgment, 30/6/2016 The case concerned a refusal by the Italian authorities to grant a residence permit to a gay couple on family grounds. The Court found in particular that the situation of Mr Taddeucci and Mr McCall, a gay couple, could not be understood as comparable to that of an unmarried heterosexual couple. As they could not marry or, at the relevant time, obtain any other form of legal recognition of their situation in Italy, they could not be classified as spouses under national law. .... Thus the Court concluded that there had been a violation of Article 14 (prohibition of discrimination) taken together with Article 8 (right to respect for private and family life) of the European Convention on Human Rights.
Challenge 3: Intersectionality
New categories; which categories? Examples: Black women (Crenshaw, 1989), mothers (Fine, 2010). Questions: Statistics, algorithm design + user interface: constraints vs. exploration. Sociological: when is a disadvantaged group a group? Experience and statistics of decisions? Which grounds do we accept? (e.g. black women within a seniority-based layoff in DeGraffenreid vs. General Motors). Legal: "the prospect of opening the hackneyed Pandora's box" (DeGraffenreid vs. General Motors, 1977); multiple/pluridimensional disadvantaging is rarely explicitly mentioned in legal rules against discrimination (Baer, Bittner, & Göttsche, 2011).
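On the statistics question, a purely illustrative sketch (hypothetical hiring data, attribute names chosen only for the example): single-attribute views can understate a disadvantage that only becomes visible at the intersection of categories.

```python
# Sketch (hypothetical data): the disadvantage of an intersectional group
# can be far starker than either single-attribute view suggests.
import pandas as pd

df = pd.DataFrame({
    "gender": ["f", "f", "f", "f", "m", "m", "m", "m"],
    "race":   ["b", "b", "w", "w", "b", "b", "w", "w"],
    "hired":  [0,    0,   1,   1,   1,   1,   1,   1],
})

print(df.groupby("gender")["hired"].mean())            # per-gender rates
print(df.groupby("race")["hired"].mean())              # per-race rates
print(df.groupby(["race", "gender"])["hired"].mean())  # intersectional rates
```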
Challenge 4: Categories can serve to detect, but also to perpetuate discrimination
Categories of humans can become dis-used and "disappear", e.g. estates of the realm, or religion (in some countries).
As computer scientists, we (believe we can) define anything. We then believe this is the truth. We then believe we can solve it. Solutionism (Morozov).
Challenge 5: We should address the causes
Remark: is the discrimination in the algorithm, or in the data, or in the world? (Causing or perpetuating discrimination?) [Decision-tree figure: credit-scoring splits such as "Has no checking account?", "Savings < 10 000?", "Male?", "Works on building site?".] The causes of discrimination are decisions! And decisions are (among other things) based on data.
[Same decision-tree figure.] The causes of discrimination are decisions! (Albeit: not only, and not necessarily, those of data scientists.)
Work to do! More AI (logic)? Focus on the analytical capabilities of algorithms?! ("exploratory DADM") Exploratory DADM for accountability; interactive DADM for better decision-making. Clarifying the roles of data, algorithms, and people, with a view to Article 22 GDPR (restrictions on fully automated decision-making). Taking a fresh look at the question "What can we, as computer scientists, do against discrimination?" in dialogue with other relevant disciplines and stakeholders.
References
Pedreschi, D., Ruggieri, S., & Turini, F. (2008). Discrimination-aware data mining. In: Proceedings of KDD '08, pp. 560-568. ACM. http://www.di.unipi.it/~ruggieri/Papers/kdd2008.pdf (and many others by the team)
Hajian, S., & Domingo-Ferrer, J. (2013). A methodology for direct and indirect discrimination prevention in data mining. IEEE Transactions on Knowledge and Data Engineering, 25(7), 1445-1459. http://crises2-deim.urv.cat/docs/publications/journals/684.pdf
Hajian, S., Domingo-Ferrer, J., & Farràs, O. (2014). Generalization-based privacy preservation and discrimination prevention in data publishing and mining. Data Mining and Knowledge Discovery, 28(5-6), 1158-1188. http://crises2-deim.urv.cat/docs/publications/journals/813.pdf
Kamiran, F., Calders, T., & Pechenizkiy, M. (2010). Discrimination aware decision tree learning. ICDM 2010, pp. 869-874. http://wwwis.win.tue.nl/~tcalders/pubs/TR10-13.pdf
Kamiran, F., Zliobaite, I., & Calders, T. (2013). Quantifying explainable discrimination and removing illegal discrimination in automated decision making. Knowledge and Information Systems, 35(3), 613-644. http://repository.tue.nl/737123
Berendt, B., & Preibusch, S. (2014). Better decision support through exploratory discrimination-aware data mining: foundations and empirical evidence. Artificial Intelligence and Law, 22(2), 175-209. http://people.cs.kuleuven.be/~bettina.berendt/Papers/berendt_preibusch_2014.pdf
Berendt, B., Büchler, M., & Rockwell, G. (2015). Is it research or is it spying? Thinking-through ethics in Big Data AI and other knowledge sciences. Künstliche Intelligenz, 29(2), 223-232. https://people.cs.kuleuven.be/~bettina.berendt/Papers/berendt_buechler_rockwell_KUIN_2015.pdf
Naudts, L. (2015). Algorithms Legal Framework. Presentation in the Privacy and Big Data course, KU Leuven.
Crenshaw, K. (1989). Demarginalizing the intersection of race and sex: A Black feminist critique of antidiscrimination doctrine. The University of Chicago Legal Forum, pp. 139-167.
Fine, C. (2010). Delusions of Gender: The Real Science Behind Sex Differences. Icon Books, London.
Baer, S., Bittner, M., & Göttsche, A.L. (2011). Mehrdimensionale Diskriminierung - Begriffe, Theorien und juristische Analyse. Antidiskriminierungsstelle des Bundes. http://www.antidiskriminierungsstelle.de/SharedDocs/Downloads/DE/publikationen/Expertisen/Expertise_Mehrdimensionale_Diskriminierung_jur_Analyse.pdf?__blob=publicationFile
Further sources and images:
https://en.wikipedia.org/wiki/File:Home_Owners%27_Loan_Corporation_Philadelphia_redlining_map.jpg
http://www.coe.int/en/web/sogi/-/judgment-on-taddeucci-and-mccall-v-italy
http://faculty.law.miami.edu/zfenton/documents/DeGraffenreidv.GM.pdf
http://www.visionlearning.com/img/library/large_images/image_2555.png
http://image.slidesharecdn.com/lecture14-humanvariation-140615151707-phpapp01/95/lecture-14-human-variation-45-638.jpg?cb=1402845776
http://superselected.com/wp-content/uploads/2016/07/Ieshia-Evans.jpg