American Journal of Applied Mathematics and Statistics
ISSN (Print): 2328-7306 ISSN (Online): 2328-7292 Editor-in-chief: Mohamed Seddeek
Open Access
American Journal of Applied Mathematics and Statistics. 2018, 6(6), 266-271
DOI: 10.12691/ajams-6-6-8

Comparison of Accuracy of Support Vector Machine Model and Logistic Regression Model in Predicting Individual Loan Defaults

Obare DM¹ and Muraya MM¹

¹Physical Sciences Department, P.O. Box 109, Chuka University, Chuka, Nairobi, Kenya

Pub. Date: December 14, 2018

Cite this paper:
Obare DM and Muraya MM. Comparison of Accuracy of Support Vector Machine Model and Logistic Regression Model in Predicting Individual Loan Defaults. American Journal of Applied Mathematics and Statistics. 2018; 6(6):266-271. doi: 10.12691/ajams-6-6-8


Predicting loan defaults is critical for financial institutions seeking to minimize losses from non-payment. Models that have been used to predict loan default include logistic regression, linear discriminant analysis and extreme value theory models. These models are parametric: they assume that the response being investigated takes a particular functional form. If the assumed form differs substantially from the actual functional form of the response, the resulting model will be inaccurate. The support vector machine (SVM) is non-parametric and makes no prior assumption about the functional form of the data. The purpose of this study was to compare the prediction of individual loan defaults in Kenya using support vector machine and logistic regression models. The data were obtained from Equity Bank for the period between 2006 and 2016. A sample of 1000 loan applicants whose loans had been approved was used. The variables considered were credit history, purpose of the loan, loan amount, savings account status, employment status, gender, age, security and area of residence. The data were split into training and test sets, and the training set was used to fit the logistic regression and support vector machine models. The logistic regression model showed an accuracy of 0.7727 on the training data and 0.7333 on the test data, with a precision of 0.8440 and 0.8244 on the training and test data respectively. The SVM (linear kernel) model showed an accuracy of 0.8829 on the training data and 0.8612 on the test data, with a precision of 0.8785 and 0.7831 on the training and test data respectively. The results showed that the support vector machine model performed better than the logistic regression model, achieving higher accuracy on both the training and test data. The study recommended the use of support vector machines for loan default prediction in financial institutions.
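The abstract reports accuracy and precision without defining them. As a minimal sketch, assuming the usual definitions (accuracy is the share of all predictions that are correct; precision is the share of predicted positives that are truly positive), the two metrics can be computed as below. The labels are made up for illustration and are not the study's data, and treating "default" as the positive class is an assumption, since the abstract does not say which class was taken as positive.

```python
def accuracy(y_true, y_pred):
    """Share of all predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Share of predicted positives that are truly positive."""
    true_of_predicted_pos = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not true_of_predicted_pos:
        return 0.0  # no positive predictions were made
    hits = sum(1 for t in true_of_predicted_pos if t == positive)
    return hits / len(true_of_predicted_pos)


# Hypothetical labels (assumption): 1 = default, 0 = no default.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]

acc = accuracy(y_true, y_pred)
prec = precision(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={prec:.4f}")
```

Because the two metrics measure different things, a model can lead on one and trail on the other, which is why the abstract reports both for each model and each data split.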

Keywords: loan defaults, prediction model, logistic regression model, support vector machine model

This work is licensed under a Creative Commons Attribution 4.0 International License.

