Maximizing generalized linear mixed model likelihoods with an automated Monte Carlo EM algorithm

Cited by: 349
Authors
Booth, JG [1]
Hobert, JP [1]
Affiliations
[1] Univ Florida, Dept Stat, Gainesville, FL 32611 USA
Keywords
confidence ellipsoid; Hastings-Metropolis algorithm; importance sampling; Laplace approximation; Markov chain Monte Carlo method; rejection sampling; salamander data; sandwich variance estimate
DOI
10.1111/1467-9868.00176
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
Two new implementations of the EM algorithm are proposed for maximum likelihood fitting of generalized linear mixed models. Both methods use random (independent and identically distributed) sampling to construct Monte Carlo approximations at the E-step. One approach involves generating random samples from the exact conditional distribution of the random effects (given the data) by rejection sampling, using the marginal distribution as a candidate. The second method uses a multivariate t importance sampling approximation. In many applications the two methods are complementary. Rejection sampling is more efficient when sample sizes are small, whereas importance sampling is better with larger sample sizes. Monte Carlo approximation using random samples allows the Monte Carlo error at each iteration to be assessed by using standard central limit theory combined with Taylor series methods. Specifically, we construct a sandwich variance estimate for the maximizer at each approximate E-step. This suggests a rule for automatically increasing the Monte Carlo sample size after iterations in which the true EM step is swamped by Monte Carlo error. In contrast, techniques for assessing Monte Carlo error have not been developed for use with alternative implementations of Monte Carlo EM algorithms utilizing Markov chain Monte Carlo E-step approximations. Three different data sets, including the infamous salamander data of McCullagh and Nelder, are used to illustrate the techniques and to compare them with the alternatives. The results show that the methods proposed can be considerably more efficient than those based on Markov chain Monte Carlo algorithms. However, the methods proposed may break down when the intractable integrals in the likelihood function are of high dimension.
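The rejection-sampling E-step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes a single cluster of Bernoulli responses with one normal random intercept, logit P(y_j = 1) = beta + u with u ~ N(0, sigma^2), and all function names are hypothetical. Candidates are drawn from the marginal distribution of the random effect and accepted with probability f(y | u) / sup_u f(y | u), which yields independent draws from the conditional distribution of u given y. Because the draws are independent, the Monte Carlo standard error of an E-step average follows from the ordinary central limit theorem. The sandwich variance estimate and the automated sample-size rule are not implemented here.

```python
import math
import random


def log1pexp(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def log_lik(y, beta, u):
    """log f(y | u) for Bernoulli responses with logit P(y_j = 1) = beta + u."""
    eta = beta + u
    s = sum(y)
    # log p = -log(1 + exp(-eta)), log(1 - p) = -log(1 + exp(eta))
    return -s * log1pexp(-eta) - (len(y) - s) * log1pexp(eta)


def log_sup_lik(y):
    """log of sup_u f(y | u); the supremum is attained at p = mean(y)."""
    n, s = len(y), sum(y)
    if s == 0 or s == n:
        return 0.0  # the likelihood approaches 1 as p -> 0 or p -> 1
    p = s / n
    return s * math.log(p) + (n - s) * math.log(1.0 - p)


def sample_random_effect(y, beta, sigma, size, rng):
    """Draw `size` i.i.d. values from the conditional law of u given y."""
    log_tau = log_sup_lik(y)
    draws = []
    while len(draws) < size:
        u = rng.gauss(0.0, sigma)  # candidate from the marginal N(0, sigma^2)
        # accept with probability f(y | u) / sup_u f(y | u)
        if math.log(1.0 - rng.random()) <= log_lik(y, beta, u) - log_tau:
            draws.append(u)
    return draws


def mc_mean_and_se(draws):
    """Monte Carlo mean and its CLT-based standard error for i.i.d. draws."""
    m = len(draws)
    mean = sum(draws) / m
    var = sum((d - mean) ** 2 for d in draws) / (m - 1)
    return mean, math.sqrt(var / m)


rng = random.Random(0)
y = [1, 1, 1, 0, 1]  # illustrative data, not taken from the paper
draws = sample_random_effect(y, beta=0.0, sigma=1.0, size=2000, rng=rng)
post_mean, post_se = mc_mean_and_se(draws)
print(f"E[u | y] is approximately {post_mean:.3f} (MC s.e. {post_se:.3f})")
```

In a full Monte Carlo EM iteration these draws would be used to average the complete-data log-likelihood before the M-step, and a standard error of this kind is the sort of quantity that would inform a decision to increase the Monte Carlo sample size.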
Pages: 265-285
Number of pages: 21