A Beam-Search Based Method to Select Classification and Imputation Methods for Fair and Accurate Data Analysis

Date

2024-12

Citation of Original Publication

Mowoh, Dodavah, and Zhiyuan Chen. "A Beam-Search Based Method to Select Classification and Imputation Methods for Fair and Accurate Data Analysis." In 2024 IEEE International Conference on Big Data (BigData), 5281–88, 2024. https://doi.org/10.1109/BigData62323.2024.10825524.

Rights

© 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Abstract

Members of disadvantaged or minority groups are often more likely to have missing values in their records. Imputation is a common approach for handling missing values before the data is analyzed. Several studies have found an interplay between imputation methods and classification methods with respect to accuracy and fairness: different combinations of imputation and classification methods lead to different accuracy and fairness results. However, it is unclear how to choose the combination of imputation method and classification method that optimizes the tradeoff between accuracy and fairness. An exhaustive search is too expensive because it must check all combinations and measure both accuracy and fairness for each one. This paper proposes a beam-search based method to select the optimal combination of imputation and classification methods. An empirical study compares the performance of the proposed method with that of exhaustive search. The proposed solution achieves the same result as exhaustive search but at a much lower search cost.
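
To make the contrast between exhaustive search and beam search concrete, the following is a minimal sketch of a generic two-stage beam search; it is not the paper's algorithm, whose details the abstract does not give. Everything named here is hypothetical: the method labels (`imp_a`, `clf_x`, and so on), the placeholder score tables, and in particular the assumption that a partial candidate (an imputation method alone) can be scored by some cheaper proxy. The single integer score stands in for whatever combined accuracy and fairness measure is being optimized, and the numbers are invented for illustration only.

```python
from itertools import product

# Hypothetical method labels; not taken from the paper.
IMPUTERS = ["imp_a", "imp_b", "imp_c", "imp_d"]
CLASSIFIERS = ["clf_w", "clf_x", "clf_y", "clf_z"]

# Placeholder scores (higher is better), invented for illustration.
IMP_BASE = {"imp_a": 1, "imp_b": 4, "imp_c": 3, "imp_d": 2}
CLF_BASE = {"clf_w": 2, "clf_x": 5, "clf_y": 1, "clf_z": 3}
# A pair-specific bonus models the interplay between the two choices.
INTERACTION = {("imp_c", "clf_x"): 2}


def toy_score(candidate):
    """Score a partial (imputer,) or complete (imputer, classifier) tuple.

    Assumption: a partial candidate is scored by a proxy (IMP_BASE alone).
    """
    if len(candidate) == 1:
        return IMP_BASE[candidate[0]]
    imp, clf = candidate
    return IMP_BASE[imp] + CLF_BASE[clf] + INTERACTION.get((imp, clf), 0)


def beam_search(stages, score, beam_width):
    """Keep only the top `beam_width` candidates after each stage.

    Returns the best complete candidate and the number of score calls.
    """
    beam = [()]
    evaluations = 0
    for options in stages:
        # Extend every surviving partial candidate with every option.
        candidates = [partial + (opt,) for partial in beam for opt in options]
        scored = [(score(cand), cand) for cand in candidates]
        evaluations += len(candidates)
        scored.sort(key=lambda pair: pair[0], reverse=True)
        beam = [cand for _, cand in scored[:beam_width]]
    return beam[0], evaluations


def exhaustive_search(stages, score):
    """Score every complete combination and return the best one."""
    combos = list(product(*stages))
    return max(combos, key=score), len(combos)


if __name__ == "__main__":
    stages = [IMPUTERS, CLASSIFIERS]
    print(beam_search(stages, toy_score, beam_width=2))
    print(exhaustive_search(stages, toy_score))
```

On these placeholder scores, a beam width of 2 returns the same pair as exhaustive search with fewer score calls, while a beam width of 1 settles on a different pair. Beam search is a heuristic in general, so whether it matches exhaustive search depends on the beam width and on how well partial candidates can be ranked.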