Impact of Imputation Strategies on Fairness in Machine Learning

Simon Caton, Saiteja Malisetty, Christian Haas

Publication: Scientific journal, original article in journal, peer-reviewed

Abstract

Research on Fairness and Bias Mitigation in Machine Learning often uses a set of reference datasets for the design and evaluation of novel approaches or definitions. While these datasets are well structured and useful for the comparison of various approaches, they do not reflect that datasets commonly used in real-world applications can have missing values. When such missing values are encountered, the use of imputation strategies is commonplace. However, as imputation strategies potentially alter the distribution of data they can also affect the performance, and potentially the fairness, of the resulting predictions, a topic not yet well understood in the fairness literature. In this article, we investigate the impact of different imputation strategies on classical performance and fairness in classification settings. We find that the selected imputation strategy, along with other factors including the type of classification algorithm, can significantly affect performance and fairness outcomes. The results of our experiments indicate that the choice of imputation strategy is an important factor when considering fairness in Machine Learning. We also provide some insights and guidance for researchers to help navigate imputation approaches for fairness.
Original language: English
Pages (from - to): 1011 - 1035
Journal: Journal of Artificial Intelligence Research
Volume: 74
Publication status: Published - 2022

Austrian Classification of Fields of Science (ÖFOS)

  • 102001 Artificial Intelligence
  • 102
