Causal Interpretations of Black-Box Models

Cited by: 284
Authors
Zhao, Qingyuan [1]
Hastie, Trevor [2]
Affiliations
[1] Univ Penn, Dept Stat, 400 Huntsman Hall,3730 Walnut St, Philadelphia, PA 19104 USA
[2] Stanford Univ, Dept Stat, Sequoia Hall,390 Serra Mall, Stanford, CA 94305 USA
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
Back-door adjustment; Data visualization; Machine learning; Mediation analysis; Partial dependence plot; INFERENCE; REGRESSION;
DOI
10.1080/07350015.2019.1624293
Chinese Library Classification number
F [Economics]
Discipline classification code
02
Abstract
The fields of machine learning and causal inference have developed many concepts, tools, and theory that are potentially useful for each other. Through exploring the possibility of extracting causal interpretations from black-box machine-trained models, we briefly review the languages and concepts in causal inference that may be interesting to machine learning researchers. We start with the curious observation that Friedman's partial dependence plot has exactly the same formula as Pearl's back-door adjustment and discuss three requirements to make causal interpretations: a model with good predictive performance, some domain knowledge in the form of a causal diagram and suitable visualization tools. We provide several illustrative examples and find some interesting and potentially causal relations using visualization tools for black-box models.
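The observation in the abstract can be made concrete with a short sketch. For a fitted model g and a feature subset S with complement C, the partial dependence function is estimated as the average of g(x_S, X_iC) over the n observed rows. Pearl's back-door adjustment averages E[Y | X_S = x_S, X_C = x_C] over the marginal distribution of X_C, which has the same form when g approximates that conditional expectation. The code below is an illustration only, not the authors' implementation: the names `partial_dependence` and `toy_model` and the simulated data are assumptions made for this example. A causal reading of the resulting curve would additionally require that the complement features satisfy the back-door criterion in an assumed causal diagram, which the code does not check.

```python
import numpy as np


def partial_dependence(predict, X, feature, grid):
    """Estimate the partial dependence of `predict` on one column of X.

    For each grid value x_s, the chosen column is overwritten with x_s in
    every row and the predictions are averaged over the observed rows.
    """
    X = np.asarray(X, dtype=float)
    values = []
    for x_s in grid:
        X_mod = X.copy()
        X_mod[:, feature] = x_s  # set the feature to x_s for all rows
        values.append(predict(X_mod).mean())  # average over the other features
    return np.array(values)


# Hypothetical data and a stand-in "black box" whose form is known,
# so the estimate can be compared with the exact average.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))


def toy_model(X):
    # Assumed model for illustration: main effect plus an interaction.
    return 2.0 * X[:, 0] + X[:, 0] * X[:, 1]


grid = np.array([-1.0, 0.0, 1.0])
pd_curve = partial_dependence(toy_model, X, feature=0, grid=grid)
```

For this toy model the curve equals `2 * grid + grid * X[:, 1].mean()`, so it is close to a straight line with slope 2, since the second feature has mean near zero in the simulated sample.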
Pages: 272-281
Page count: 10