Intersectional Fairness in Machine Learning: Measurements, Algorithms, and Applications

Date

2022-01-01

Department

Information Systems

Program

Information Systems

Rights

Access limited to the UMBC community. Item may possibly be obtained via Interlibrary Loan through a local library, pending author/copyright holder's permission.
This item may be protected under Title 17 of the U.S. Copyright Law. It is made available by UMBC for non-commercial research and education. For permission to publish or reproduce, please see http://aok.lib.umbc.edu/specoll/repro.php or contact Special Collections at speccoll(at)umbc.edu

Abstract

With the increasing impact of machine learning (ML) algorithms on many facets of life, there are growing concerns that biases inherent in data can lead these algorithms to discriminate against certain populations, e.g., on the basis of race or gender. A number of studies have subsequently demonstrated that bias and fairness issues in ML are both harmful and pervasive. This thesis makes several advances for fairness in ML by addressing fundamental challenges and by developing technical solutions for real-world applications. Our first contribution is to propose definitions of fairness in ML systems that are informed by the framework of intersectionality, a critical lens from the legal, social science, and humanities literature that analyzes how interlocking systems of power and oppression affect individuals along overlapping dimensions. The measurement of fairness becomes statistically challenging in the intersectional setting, however, due to data sparsity. To address this, we present Bayesian probabilistic modeling approaches for the reliable estimation of intersectional fairness. We then enforce our fairness criteria in supervised learning algorithms using a stochastic approximation-based approach that scales to big data. Unlike most traditional fairness research, we also address unsupervised learning, developing a fair inference technique for probabilistic graphical models based on the same stochastic approach. To demonstrate the generality of our fundamental methods and their potential for important societal impact, this thesis also presents a number of real-world applications of intersectional fair ML methods. Motivated by the 2016 ProPublica report alleging significant biases in an AI-based system widely used across the U.S. to predict defendants' risk of re-offending, we build a special-purpose graphical model for criminal justice risk assessments and use our fairness approach to prevent its inferences from encoding unfair biases. Continuing the theme of putting fairness into practice in real-world applications, we develop a neural fair collaborative filtering framework for mitigating discrimination in academic major and career recommendations. Furthermore, one of the major barriers to the deployment of fairness-preserving applications is the conventional wisdom that fairness comes at a cost in predictive performance, which could affect an organization's bottom line. We systematically study the behavior of our fair learning algorithms and demonstrate that it is possible to improve fairness to some degree without sacrificing predictive performance via a sensible hyper-parameter selection strategy. To show the utility of our approaches throughout the thesis, we conduct extensive experiments on census, criminal recidivism, hospitalization, social media, banking, and loan application datasets. Our results reveal a pathway toward increasing the deployment of fair ML methods, with potentially substantial positive societal impacts.
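
To make the intersectional measurement idea concrete, the following minimal Python sketch (an illustration only, not the thesis's exact formulation) computes a worst-case log-ratio of positive-outcome rates across all intersectional subgroups defined by multiple protected attributes, with simple pseudo-count (Beta-style) smoothing standing in for the Bayesian estimation described in the abstract. The function name, arguments, and smoothing constants are assumptions introduced here for illustration.

    # Illustrative sketch, not the thesis's exact method: an epsilon-style
    # intersectional fairness measure, computed as the worst-case log-ratio of
    # smoothed positive-prediction rates across all intersectional subgroups.
    from itertools import product
    import numpy as np

    def intersectional_epsilon(y_pred, group_attrs, alpha=1.0, beta=1.0):
        """Worst-case pairwise log-ratio of smoothed positive rates.

        y_pred      : array of binary predictions (0/1).
        group_attrs : 2D array, one column per protected attribute
                      (e.g., race, gender); rows align with y_pred.
        alpha, beta : hypothetical pseudo-counts (a Beta-style prior), a
                      stand-in for the thesis's Bayesian estimation.
        """
        y_pred = np.asarray(y_pred)
        group_attrs = np.asarray(group_attrs)

        # Enumerate every observed intersectional subgroup (e.g., race x gender).
        values_per_attr = [np.unique(group_attrs[:, j])
                           for j in range(group_attrs.shape[1])]
        rates = []
        for combo in product(*values_per_attr):
            mask = np.all(group_attrs == np.array(combo), axis=1)
            n = mask.sum()
            if n == 0:
                continue  # unobserved intersection; a full model would pool strength here
            k = y_pred[mask].sum()
            rates.append((k + alpha) / (n + alpha + beta))  # smoothed positive rate

        # epsilon = max over subgroup pairs of |log(rate_i / rate_j)|;
        # smaller epsilon means subgroups receive positive outcomes at more similar rates.
        log_rates = np.log(np.array(rates))
        return float(log_rates.max() - log_rates.min())

The pseudo-counts keep the estimate finite and stable when an intersectional subgroup has very few members, which is precisely the sparsity problem that motivates the Bayesian treatment in the thesis.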