Journal of Computer Sciences and Applications
ISSN (Print): 2328-7268 ISSN (Online): 2328-725X Website: https://www.sciepub.com/journal/jcsa Editor-in-chief: Minhua Ma, Patricia Goncalves
Open Access
Journal of Computer Sciences and Applications. 2022, 10(1), 6-15
DOI: 10.12691/jcsa-10-1-2
Open Access Article

A Review of Diabetes Datasets

Muhammad Mika’ilu Yabo1, Ahamed Baita Garko2, 3, Abubakar Atiku Muslim2 and Hassan Umar Suru2

1Department of Computer Science, Shehu Shagari College of Education, Sokoto State, Nigeria

2Department of Computer Science, Kebbi State University of Science and Technology, Aliero, Nigeria

3Department of Computer Science, Federal University Dutse, Jigawa State, Nigeria

Pub. Date: October 16, 2022

Cite this paper:
Muhammad Mika’ilu Yabo, Ahamed Baita Garko, Abubakar Atiku Muslim and Hassan Umar Suru. A Review of Diabetes Datasets. Journal of Computer Sciences and Applications. 2022; 10(1):6-15. doi: 10.12691/jcsa-10-1-2

Abstract

Many intelligent healthcare systems have been developed to diagnose human diseases such as breast cancer, hepatitis, diabetes and heart disease. Diabetes is a lifelong chronic disease that occurs when the pancreas does not produce enough insulin (Type I diabetes mellitus) or when the body cannot properly use the insulin it produces (Type II diabetes mellitus). Data mining research on diabetes has been carried out to predict Type II diabetes mellitus, with different researchers using different diabetes datasets; the Pima Indians Diabetes Dataset (PIDD) is used by the majority of them. The PIDD has only eight (8) attributes, which limits further exploration in Machine Learning (ML) for diabetes prediction. Prediction is constrained by the few attributes available in the datasets used, even though these attributes play important roles in predicting diabetes mellitus types, classes and risk factors whenever a patient is diagnosed. This paper provides a systematic review of diabetes mellitus datasets and identifies the strengths and weaknesses of the 8 attributes described in the PIDD. It also identifies the need for researchers to address this gap by enhancing the existing diabetes dataset attributes with additional attributes, identifying the attributes required for the prediction of glucose level, diabetes types, diabetes classes and diabetes risk factors, and developing a model that can be used for the prediction.

Keywords:
data mining technique, healthcare systems, diabetes mellitus datasets, diabetes dataset attributes

This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
