
Hina, S., Shaikh, A., & AbulSattar, A. (2017). Analyzing Diabetes Datasets using Data Mining. Journal of Basic & Applied Sciences, 13, 466-471.

has been cited by the following article:


A Review of Diabetes Datasets

1Department of Computer Science, Shehu Shagari College of Education, Sokoto State, Nigeria

2Department of Computer Science, Kebbi State University of Science and Technology, Aliero, Nigeria

3Department of Computer Science, Federal University Dutse, Jigawa State, Nigeria


Journal of Computer Sciences and Applications. 2022, Vol. 10 No. 1, 6-15
DOI: 10.12691/jcsa-10-1-2
Copyright © 2022 Science and Education Publishing

Cite this paper:
Muhammad Mika’ilu Yabo, Ahamed Baita Garko, Abubakar Atiku Muslim, Hassan Umar Suru. A Review of Diabetes Datasets. Journal of Computer Sciences and Applications. 2022; 10(1):6-15. doi: 10.12691/jcsa-10-1-2.

Correspondence to: Muhammad Mika’ilu Yabo, Department of Computer Science, Shehu Shagari College of Education, Sokoto State, Nigeria. Email: MYMIKAILU@GMAIL.COM

Abstract

Many intelligent healthcare systems have been developed to diagnose human diseases such as breast cancer, hepatitis, diabetes and heart disease. Diabetes is a lifelong chronic disease that occurs when the pancreas does not produce enough insulin (Type I diabetes mellitus) or when the body cannot properly use the insulin it produces (Type II diabetes mellitus). Research on diabetes using data mining techniques has been carried out to predict Type II diabetes mellitus, with different researchers using different diabetes datasets; the Pima Indians Diabetes Dataset (PIDD) is used by the majority of researchers. The PIDD has eight (8) attributes, which limits further exploration in the field of Machine Learning (ML) for diabetes prediction. Diabetes prediction is limited by the few attributes available in the diabetes datasets used, and these attributes play important roles in predicting diabetes mellitus types, classes and risk factors whenever a diabetes patient is diagnosed. This paper provides a systematic review of diabetes mellitus datasets, identifying the strengths and weaknesses of the 8 attributes described in the PIDD, which is used by most researchers. Furthermore, this paper identifies the need for researchers to address this gap by enhancing the existing diabetes dataset attributes with additional attributes, identifying the attributes required for the prediction of glucose level, diabetes types, diabetes classes and diabetes risk factors, and developing a model that can be used for the prediction.

Keywords