Automated versus Do-It-Yourself Methods for Causal Inference: Lessons Learned from a Data Analysis Competition

Cited by: 163
Authors
Dorie, Vincent [1]
Hill, Jennifer [2]
Shalit, Uri [3]
Scott, Marc [4]
Cervone, Dan [5]
Affiliations
[1] Columbia Univ, Data Sci Inst, 475 Riverside Dr,Room 320L, New York, NY 10115 USA
[2] NYU, Dept Appl Stat Social Sci & Humanities, Appl Stat & Data Sci, 246 Greene St,3rd Floor, New York, NY 10003 USA
[3] Technion Israel Inst Technol, Fac Ind Engn & Management, IL-3200003 Haifa, Israel
[4] NYU, Dept Appl Stat, Appl Stat, 246 Greene St,3rd Floor, New York, NY 10003 USA
[5] Los Angeles Dodgers, Quantitat Res, 1000 Vin Scully Ave, Los Angeles, CA 90012 USA
Keywords
Causal inference; competition; machine learning; automated algorithms; evaluation; PROPENSITY-SCORE; ENSEMBLE METHODS; MISSING-DATA; REGRESSION; MODELS; STATISTICS; ADJUSTMENT; VARIABLES; SELECTION; SUPPORT;
DOI
10.1214/18-STS667
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Statisticians have made great progress in creating methods that reduce our reliance on parametric assumptions. However, this explosion in research has resulted in a breadth of inferential strategies that both create opportunities for more reliable inference and complicate the choices that an applied researcher has to make and defend. Relatedly, researchers advocating for new methods typically compare their method to at best 2 or 3 other causal inference strategies and test using simulations that may or may not be designed to equally tease out flaws in all the competing methods. The causal inference data analysis challenge, "Is Your SATT Where It's At?", launched as part of the 2016 Atlantic Causal Inference Conference, sought to make progress with respect to both of these issues. The researchers creating the data testing grounds were distinct from the researchers submitting methods whose efficacy would be evaluated. Results from 30 competitors across the two versions of the competition (black-box algorithms and do-it-yourself analyses) are presented along with post-hoc analyses that reveal information about the characteristics of causal inference strategies and settings that affect performance. The most consistent conclusion was that methods that flexibly model the response surface perform better overall than methods that fail to do so. Finally, new methods are proposed that combine features of several of the top-performing submitted methods.
Pages: 43-68
Page count: 26