High-dimensional variable screening and bias in subsequent inference, with an empirical comparison

Cited by: 30
Authors
Buehlmann, Peter [1]
Mandozzi, Jacopo [1]
Affiliations
[1] ETH, Seminar Stat, Zurich, Switzerland
Keywords
Elastic net; Lasso; Linear model; Ridge; Sparsity; Sure independence screening; Variable selection; Generalized linear models; Dantzig selector; Regularization; Recovery
DOI
10.1007/s00180-013-0436-3
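Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract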
We review variable selection and variable screening in high-dimensional linear models. Thereby, a major focus is an empirical comparison of various estimation methods with respect to true and false positive selection rates based on 128 different sparse scenarios from semi-real data (real data covariables but synthetic regression coefficients and noise). Furthermore, we present some theoretical bounds for the bias in subsequent least squares estimation, using the selected variables from the first stage, which have direct implications for construction of p-values for regression coefficients.
Pages: 407-430
Number of pages: 24