Missing-data methods for generalized linear models: A comparative review

Cited by: 318
Authors
Ibrahim, JG [1]
Chen, MH
Lipsitz, SR
Herring, AH
Affiliations
[1] Univ N Carolina, Dept Biostat, Chapel Hill, NC 27599 USA
[2] Univ Connecticut, Dept Stat, Storrs, CT 06269 USA
[3] Med Univ S Carolina, Dept Biometry & Epidemiol, Charleston, SC 29425 USA
Funding
National Institutes of Health (NIH), USA;
Keywords
EM algorithm; generalized linear model; Gibbs sampling; maximum likelihood; missing at random; multiple imputation; nonignorable missing data; posterior distribution; weighted estimating equation;
DOI
10.1198/016214504000001844
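Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code
020208 ; 070103 ; 0714 ;
Abstract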
Missing data is a major issue in many applied problems, especially in the biomedical sciences. We review four common approaches for inference in generalized linear models (GLMs) with missing covariate data: maximum likelihood (ML), multiple imputation (MI), fully Bayesian (FB), and weighted estimating equations (WEEs). There is considerable interest in how these four methodologies are related, the properties of each approach, the advantages and disadvantages of each methodology, and computational implementation. We examine data that are missing at random and nonignorable missing. For ML we focus on techniques using the EM algorithm, and in particular, discuss the EM by the method of weights and related procedures as discussed by Ibrahim. For MI, we examine the techniques developed by Rubin. For FB, we review approaches considered by Ibrahim et al. For WEE, we focus on the techniques developed by Robins et al. We use a real dataset and a detailed simulation study to compare the four methods.
Pages: 332-346
Number of pages: 15
Related papers
89 in total
[41]   Semiparametric methods for response-selective and missing data problems in regression [J].
Lawless, JF ;
Kalbfleisch, JD ;
Wild, CJ .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1999, 61 :413-438
[42]  
Leong T, 1999, STAT MED, V18, P473, DOI 10.1002/(SICI)1097-0258(19990228)18:4<473::AID-SIM21>3.0.CO;2-H
[44]   Incomplete covariates in the Cox model with applications to biological marker data [J].
Leong, T ;
Lipsitz, SR ;
Ibrahim, JG .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES C-APPLIED STATISTICS, 2001, 50 :467-484
[45]  
Lipsitz S R, 2000, Biostatistics, V1, P315, DOI 10.1093/biostatistics/1.3.315
[46]   A weighted estimating equation for missing covariate data with properties similar to maximum likelihood [J].
Lipsitz, SR ;
Ibrahim, JG ;
Zhao, LP .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1999, 94 (448) :1147-1160
[47]  
Lipsitz SR, 1999, STAT MED, V18, P2435, DOI 10.1002/(SICI)1097-0258(19990915/30)18:17/18<2435::AID-SIM267>3.0.CO;2-B
[49]   Likelihood methods for incomplete longitudinal binary responses with incomplete categorical covariates [J].
Lipsitz, SR ;
Ibrahim, JG ;
Fitzmaurice, GM .
BIOMETRICS, 1999, 55 (01) :214-223
[50]   Estimating equations with incomplete categorical covariates in the Cox model [J].
Lipsitz, SR ;
Ibrahim, JG .
BIOMETRICS, 1998, 54 (03) :1002-1013