GENERALIZED RANDOM FORESTS

Cited by: 951
Authors
Athey, Susan [1]
Tibshirani, Julie [2]
Wager, Stefan [1]
Affiliations
[1] Stanford Univ, Stanford Grad Sch Business, 655 Knight Way, Stanford, CA 94305 USA
[2] Elasticsearch BV, 800 West El Camino Real, Suite 350, Mountain View, CA 94040 USA
Keywords
Asymptotic theory; causal inference; instrumental variable; instrumental variable estimation; parameter instability; structural change; regression; tests; models; consistency; jackknife; constancy; variance
DOI
10.1214/18-AOS1709
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
We propose generalized random forests, a method for nonparametric statistical estimation based on random forests (Breiman [Mach. Learn. 45 (2001) 5-32]) that can be used to fit any quantity of interest identified as the solution to a set of local moment equations. Following the literature on local maximum likelihood estimation, our method considers a weighted set of nearby training examples; however, instead of using classical kernel weighting functions that are prone to a strong curse of dimensionality, we use an adaptive weighting function derived from a forest designed to express heterogeneity in the specified quantity of interest. We propose a flexible, computationally efficient algorithm for growing generalized random forests, develop a large sample theory for our method showing that our estimates are consistent and asymptotically Gaussian and provide an estimator for their asymptotic variance that enables valid confidence intervals. We use our approach to develop new methods for three statistical tasks: nonparametric quantile regression, conditional average partial effect estimation and heterogeneous treatment effect estimation via instrumental variables. A software implementation, grf for R and C++, is available from CRAN.
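The weighting idea in the abstract can be illustrated with a minimal sketch. It assumes the trees have already been grown and that we are only given leaf assignments; the tree-growing step that targets heterogeneity in the quantity of interest is not shown. The function names `forest_weights`, `weighted_mean` and `weighted_quantile`, and the toy leaf assignments, are hypothetical and are not part of the grf API. Each training example is weighted by how often it shares a leaf with the query point, and the weights are then plugged into a local moment equation, here for a conditional mean and a conditional quantile.

```python
def forest_weights(train_leaves, query_leaves):
    """Forest-based weights for one query point.

    train_leaves[b][i] is the leaf id of training example i in tree b;
    query_leaves[b] is the leaf id of the query point in tree b.
    Each tree spreads a weight of 1/B evenly over the training examples
    that fall in the same leaf as the query point (B = number of trees).
    """
    n_trees = len(train_leaves)
    n = len(train_leaves[0])
    weights = [0.0] * n
    for leaves, q in zip(train_leaves, query_leaves):
        members = [i for i, leaf in enumerate(leaves) if leaf == q]
        for i in members:
            weights[i] += 1.0 / (len(members) * n_trees)
    return weights


def weighted_mean(y, weights):
    """Solve sum_i w_i * (y_i - theta) = 0 for theta.

    Assumes at least one weight is positive.
    """
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, y)) / total


def weighted_quantile(y, weights, q):
    """Approximately solve sum_i w_i * (1{y_i <= theta} - q) = 0.

    Returns the smallest observed y whose cumulative weight reaches q.
    """
    order = sorted(range(len(y)), key=lambda i: y[i])
    total = sum(weights)
    cumulative = 0.0
    for i in order:
        cumulative += weights[i]
        if cumulative >= q * total - 1e-12:
            return y[i]
    return y[order[-1]]


# Toy example: 2 trees, 4 training examples (assumed leaf assignments).
train_leaves = [[0, 0, 1, 1], [0, 1, 1, 1]]
query_leaves = [0, 0]
y = [1.0, 2.0, 10.0, 20.0]

w = forest_weights(train_leaves, query_leaves)
local_mean = weighted_mean(y, w)
local_median = weighted_quantile(y, w, 0.5)
```

In this toy example the first training example shares a leaf with the query point in both trees, so it receives most of the weight, and examples that never share a leaf receive zero weight. Swapping the moment equation changes the estimated quantity while the weighting step stays the same.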
Pages: 1148-1178
Page count: 31