American Journal of Applied Mathematics and Statistics
ISSN (Print): 2328-7306; ISSN (Online): 2328-7292; Editor-in-chief: Mohamed Seddeek
Open Access
American Journal of Applied Mathematics and Statistics. 2015, 3(6), 243-251
DOI: 10.12691/ajams-3-6-5
Open Access Article

Modification of the Sandwich Estimator in Generalized Estimating Equations with Correlated Binary Outcomes in Rare Event and Small Sample Settings

Paul Rogers¹ and Julie Stoner²

¹Department of Numerical Sciences, Civil Aerospace Medical Institute, Oklahoma City, OK, USA

²Department of Biostatistics and Epidemiology, College of Public Health, University of Oklahoma Health Sciences Center, Oklahoma City, OK, USA

Pub. Date: November 23, 2015

Cite this paper:
Paul Rogers and Julie Stoner. Modification of the Sandwich Estimator in Generalized Estimating Equations with Correlated Binary Outcomes in Rare Event and Small Sample Settings. American Journal of Applied Mathematics and Statistics. 2015; 3(6):243-251. doi: 10.12691/ajams-3-6-5


Regression models for correlated binary outcomes are commonly fit using Generalized Estimating Equations (GEE). GEE uses the Liang and Zeger sandwich estimator to produce consistent standard error estimates for the regression coefficients in large samples, even when the working covariance structure is misspecified. The sandwich estimator performs best in balanced designs with a large number of participants and few repeated measurements per participant. Its asymptotic properties do not hold in small samples, however, where it is biased downward and underestimates the variances. In this project, a modified form of the sandwich estimator is proposed to correct this deficiency. The performance of the new estimator is compared with that of the traditional Liang and Zeger estimator and with the alternative forms proposed by Morel, by Pan, and by Mancl and DeRouen. Each estimator is assessed by the coverage probabilities of nominal 95% confidence intervals for the regression coefficients, using data simulated under various combinations of sample size and outcome prevalence with Independence (IND), Autoregressive (AR), and Compound Symmetry (CS) correlation structures. This research is motivated by investigations involving rare-event outcomes in aviation data.
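
As a rough illustration of the traditional Liang and Zeger estimator discussed above, the following Python sketch computes the sandwich covariance for a logistic GEE under a working independence structure. It is not the authors' code and does not implement the modified estimator proposed in the paper. The function names (`fit_logistic_independence`, `sandwich_covariance`), the simulated data, and the random-intercept mechanism used to induce within-cluster correlation are all assumptions made for illustration.

```python
import numpy as np

def fit_logistic_independence(X, y, n_iter=25):
    """Solve the GEE score equations under working independence.
    With a logit link this is ordinary logistic regression via Newton steps."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-X @ beta))
        w = mu * (1.0 - mu)
        # Newton step: (X' W X)^{-1} X' (y - mu)
        beta = beta + np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (y - mu))
    return beta

def sandwich_covariance(X, y, cluster, beta):
    """Liang-Zeger sandwich: bread @ meat @ bread, where the meat sums the
    outer products of the per-cluster score contributions."""
    mu = 1.0 / (1.0 + np.exp(-X @ beta))
    w = mu * (1.0 - mu)
    bread = np.linalg.inv(X.T @ (w[:, None] * X))  # model-based covariance
    meat = np.zeros_like(bread)
    for c in np.unique(cluster):
        idx = cluster == c
        score = X[idx].T @ (y[idx] - mu[idx])  # score contribution of cluster c
        meat += np.outer(score, score)
    return bread @ meat @ bread

# Hypothetical simulated data: 40 clusters with 4 repeated binary outcomes each.
rng = np.random.default_rng(0)
n_clusters, n_obs = 40, 4
cluster = np.repeat(np.arange(n_clusters), n_obs)
group = np.repeat(rng.integers(0, 2, n_clusters), n_obs)  # cluster-level covariate
X = np.column_stack([np.ones(cluster.size), group])
u = np.repeat(rng.normal(0.0, 1.0, n_clusters), n_obs)    # shared random intercept
p = 1.0 / (1.0 + np.exp(-(-1.0 + 0.5 * group + u)))
y = rng.binomial(1, p).astype(float)

beta_hat = fit_logistic_independence(X, y)
cov = sandwich_covariance(X, y, cluster, beta_hat)
print("coefficients:", beta_hat)
print("sandwich standard errors:", np.sqrt(np.diag(cov)))
```

A coverage study of the kind described in the abstract would repeat this simulation many times and record how often the interval formed from the estimate plus or minus 1.96 standard errors contains the true coefficient. With few clusters or rare outcomes, that proportion would be expected to fall below the nominal 95%.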

Keywords: sandwich estimator, generalized estimating equation, rare event, finite sample, binary outcome, correlated outcome

This work is licensed under a Creative Commons Attribution 4.0 International License.


[1]  Liang, K.-Y. and S.L. Zeger, “Longitudinal Data Analysis Using Generalized Linear Models”, Biometrika, 1986. 73(1): p. 13-22.
[2]  Carroll, R.J., Wang, S., Simpson, D. G., Stromberg, A. J., and Ruppert, D., The Sandwich (Robust Covariance Matrix) Estimator. 1998; Available from:
[3]  Fitzmaurice, G.M., N.M. Laird, and J.H. Ware, Applied Longitudinal Analysis. Wiley series in probability and statistics. 2004, Hoboken, N.J.: Wiley-Interscience. 506 p.
[4]  Mancl, L.A. and T.A. DeRouen, “A Covariance Estimator for GEE with Improved Small-Sample Properties”, Biometrics, 2001. 57(1): p. 126-134.
[5]  Pan, W., “On the Robust Variance Estimator in Generalised Estimating Equations”, Biometrika, 2001. 88(3): p. 901-906.
[6]  King, G. and L. Zeng, “Logistic Regression in Rare Events Data”, Political Analysis, 2001. 9: p. 137-163.
[7]  Gunsolley, J.C., C. Getchell, and V.M. Chinchilli, “Small Sample Characteristics of Generalized Estimating Equations”, Communications in Statistics: Simulation and Computation, 1995. 24: p. 869-878.
[8]  Morel, J.G., “Logistic Regression Under Complex Survey Designs”, Survey Methodology, 1989. 15(2): p. 203-223.
[9]  Morel, J.G., M.C. Bokossa, and N.K. Neerchal, “Small Sample Correction for the Variance of GEE Estimators”, Biometrical Journal, 2003. 45(4): p. 395-409.
[10]  Johnson, R.A. and D.W. Wichern, Applied Multivariate Statistical Analysis. 6th ed. 2007, Upper Saddle River, N.J.: Pearson Prentice Hall. 773 p.
[11]  Qaqish, B.F., “A Family of Multivariate Binary Distributions for Simulating Correlated Binary Variables with Specified Marginal Means and Correlations”, Biometrika, 2003. 90: p. 455-463.
[12]  Peterman, C.L., Rogers, P. B., Veronneau, S. J. H., and Whinnery, J. E., “Development of an Aeromedical Scientific Information System for Aviation Safety”, Office of Aerospace Medicine, 2008. Report No. DOT/FAA/AM-08/01.