The Road to Explainability is Paved with Bias: Measuring the Fairness of Explanations

Cited by: 39
Authors
Balagopalan, Aparna [1 ]
Zhang, Haoran [1 ]
Hamidieh, Kimia [2 ]
Hartvigsen, Thomas [1 ]
Rudzicz, Frank [2 ]
Ghassemi, Marzyeh [1 ]
Affiliations
[1] MIT, Cambridge, MA 02139 USA
[2] Univ Toronto, Vector Inst, Toronto, ON, Canada
Source
PROCEEDINGS OF 2022 5TH ACM CONFERENCE ON FAIRNESS, ACCOUNTABILITY, AND TRANSPARENCY, FACCT 2022 | 2022
Keywords
explainability; machine learning; fairness; MACHINE; DECISIONS; IMPACT
DOI
10.1145/3531146.3533179
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Machine learning models in safety-critical settings like healthcare are often "blackboxes": they contain a large number of parameters that are not transparent to users. Post-hoc explainability methods, in which a simple, human-interpretable model imitates the behavior of these blackbox models, are often proposed to help users trust model predictions. In this work, we audit the quality of such explanations for different protected subgroups using real data from four settings in finance, healthcare, college admissions, and the US justice system. Across two different blackbox model architectures and four popular explainability methods, we find that the approximation quality of explanation models, also known as the fidelity, differs significantly between subgroups. We also demonstrate that pairing explainability methods with recent advances in robust machine learning can improve explanation fairness in some settings. However, we highlight the importance of communicating details of non-zero fidelity gaps to users, since a single solution might not exist across all settings. Finally, we discuss the implications of unfair explanation models as a challenging and understudied problem facing the machine learning community.
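A minimal sketch of the fidelity-gap measurement the abstract describes, under stated assumptions: it uses a single global decision-tree surrogate rather than the per-instance local explainers audited in the paper, defines fidelity as label agreement between the blackbox and the surrogate, and runs on synthetic data with a hypothetical binary protected attribute. None of the names or choices below come from the authors' code.

```python
# Sketch: per-subgroup fidelity of a post-hoc explanation (surrogate) model.
# Assumptions: global surrogate instead of per-instance LIME/SHAP surrogates,
# fidelity = label agreement with the blackbox, synthetic data and a
# hypothetical protected attribute `a`.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical data: features X, labels y, binary protected attribute a.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
a = np.random.RandomState(0).randint(0, 2, size=len(y))
X_tr, X_te, y_tr, y_te, a_tr, a_te = train_test_split(X, y, a, random_state=0)

# "Blackbox" model whose behavior we want to explain.
blackbox = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

# Simple, human-interpretable surrogate trained to imitate the blackbox's labels.
surrogate = DecisionTreeClassifier(max_depth=4, random_state=0)
surrogate.fit(X_tr, blackbox.predict(X_tr))

# Fidelity per subgroup: how often the surrogate agrees with the blackbox.
bb_pred = blackbox.predict(X_te)
sg_pred = surrogate.predict(X_te)
fidelity = {g: float(np.mean(bb_pred[a_te == g] == sg_pred[a_te == g]))
            for g in np.unique(a_te)}
gap = max(fidelity.values()) - min(fidelity.values())
print(f"per-group fidelity: {fidelity}, fidelity gap: {gap:.3f}")
```

The paper's audit applies local post-hoc explanation methods to real datasets and compares fidelity across protected subgroups; this sketch only illustrates the fidelity-gap idea in its simplest global-surrogate form.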
Pages: 1194-1206
Number of pages: 13