American Journal of Applied Mathematics and Statistics
ISSN (Print): 2328-7306 ISSN (Online): 2328-7292 Website: https://www.sciepub.com/journal/ajams Editor-in-chief: Mohamed Seddeek
Open Access
American Journal of Applied Mathematics and Statistics. 2021, 9(1), 28-37
DOI: 10.12691/ajams-9-1-5
Open Access Article

Effects of Random Sampling Methods on Maximum Likelihood Estimates of a Simple Logistic Regression Model

Oshada Senaweera1,2, Prasanna S. Haddela1 and Gayan Dharmarathne2

1Department of Information Technology, Sri Lanka Institute of Information Technology, Malabe, Sri Lanka

2Department of Statistics, University of Colombo, Colombo, Sri Lanka

Pub. Date: January 31, 2021

Cite this paper:
Oshada Senaweera, Prasanna S. Haddela and Gayan Dharmarathne. Effects of Random Sampling Methods on Maximum Likelihood Estimates of a Simple Logistic Regression Model. American Journal of Applied Mathematics and Statistics. 2021; 9(1):28-37. doi: 10.12691/ajams-9-1-5

Abstract

This paper compares the effects of several random sampling methods on the maximum likelihood estimates of a simple logistic regression model. The study uses data simulated by Monte Carlo methods from logistic populations with pre-defined parameter values. The sampling techniques are Simple Random Sampling (SRS) and six variations of Stratified Sampling: two single-stage designs and four choice-based (two-phase) designs. The parameter estimates obtained under each sampling technique were compared using three performance measures: bias, standard error, and the percentage of models that could feasibly be estimated. The simulation-based analysis found that choice-based sampling with proportional allocation in both phases is the best-suited sampling technique for parameter estimation of a simple logistic regression model.
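The kind of comparison the abstract describes can be illustrated with a small Monte Carlo sketch. It is not the authors' code. The parameter values, the standard normal covariate, the population and sample sizes, and all function names are assumptions chosen for illustration. The stratified design here is a single-stage stratification on the outcome with proportional allocation, used as a simplified stand-in for the paper's two-phase choice-based designs, and "feasibly estimated" is taken to mean that Newton-Raphson converged.

```python
import math
import random
import statistics


def simulate_population(n, beta0, beta1, rng):
    """Draw (x, y) pairs from a simple logistic model with known parameters."""
    population = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # assumed covariate distribution
        p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
        population.append((x, 1 if rng.random() < p else 0))
    return population


def fit_logistic(sample, max_iter=50, tol=1e-8):
    """Newton-Raphson MLE of (beta0, beta1); None if it cannot be estimated."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in sample:
            eta = max(-30.0, min(30.0, b0 + b1 * x))  # guard against overflow
            p = 1.0 / (1.0 + math.exp(-eta))
            w = p * (1.0 - p)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * x      # score for the slope
            h00 += w               # information matrix entries
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:           # near-singular, e.g. separated data
            return None
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) < tol and abs(d1) < tol:
            return b0, b1
    return None


def srs(population, n, rng):
    """Simple random sample without replacement."""
    return rng.sample(population, n)


def stratified_by_outcome(population, n, rng):
    """Stratify on y and allocate the sample proportionally to stratum size."""
    ones = [unit for unit in population if unit[1] == 1]
    zeros = [unit for unit in population if unit[1] == 0]
    n1 = round(n * len(ones) / len(population))
    return rng.sample(ones, n1) + rng.sample(zeros, n - n1)


def compare(beta0=-1.0, beta1=0.5, pop_size=5000, n=200, reps=200, seed=1):
    """Bias and standard error of the slope estimate under each design."""
    rng = random.Random(seed)
    population = simulate_population(pop_size, beta0, beta1, rng)
    designs = (("SRS", srs), ("stratified on y", stratified_by_outcome))
    results = {}
    for name, sampler in designs:
        fits = [fit_logistic(sampler(population, n, rng)) for _ in range(reps)]
        slopes = [fit[1] for fit in fits if fit is not None]
        results[name] = {
            "bias": statistics.mean(slopes) - beta1,
            "se": statistics.stdev(slopes),
            "pct_estimated": 100.0 * len(slopes) / reps,
        }
    return results


if __name__ == "__main__":
    for design, measures in compare().items():
        print(design, measures)
```

Running the script prints the three performance measures for each design. With these assumed settings the two designs are not expected to reproduce the paper's ranking, since the sketch omits the two-phase structure and the other allocation schemes.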

Keywords:
Monte-Carlo simulations, random sampling, logistic regression, maximum likelihood estimates

Creative Commons: This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
