Article citations

Li, P., Karunanidhi, D., Subramani, T., & Srinivasamoorthy, K. (2021). Sources and consequences of groundwater contamination. Archives of Environmental Contamination and Toxicology, 80(1), 1-10.

has been cited by the following article:

Article

Predictive Modelling of Groundwater Quality in the Nakanbé River Basin Using Machine Learning Techniques

1Mining Engineering Department, University Yembila-Abdoulaye-TOGUYENI, BP 54 Fada N' Gourma, Burkina Faso

2Geosciences and Environment Laboratory (LaGE), Joseph KI-ZERBO University, BP 7021, Ouagadougou, Burkina Faso

3Direction de la Qualité des Eaux/Ministère de l'Environnement de l'Eau et de l'Assainissement, Burkina Faso

4Laboratoire Sciences et Technologie (LaST), Université Thomas SANKARA, 12 BP 417 Ouagadougou 12, Burkina Faso


American Journal of Water Resources. 2025, Vol. 13 No. 3, 86-96
DOI: 10.12691/ajwr-13-3-3
Copyright © 2025 Science and Education Publishing

Cite this paper:
Issoufou OUEDRAOGO, W. J. P. SANDWIDI, Fatoumata KABORE, Mahamadou KONARE, Cheick Abdramane OUATTARA. Predictive Modelling of Groundwater Quality in the Nakanbé River Basin Using Machine Learning Techniques. American Journal of Water Resources. 2025; 13(3):86-96. doi: 10.12691/ajwr-13-3-3.

Correspondence to: Issoufou OUEDRAOGO, Mining Engineering Department, University Yembila-Abdoulaye-TOGUYENI, BP 54 Fada N' Gourma, Burkina Faso. Email: ouedraogo.issoufou03@gmail.com

Abstract

Continuous monitoring of groundwater quality is essential for protecting public health and the environment, particularly in vulnerable regions such as Burkina Faso’s Nakanbé Basin, where groundwater is a primary source of potable water. This study aimed to develop and evaluate machine learning (ML) models to predict two key water quality parameters, Total Dissolved Solids (TDS) and Total Alkalinity (TA), using data provided by the General Directorate of Water Resources (DGRE). A total of 1,765 groundwater samples were analyzed, covering nineteen physicochemical parameters. Prior to modelling, a multicollinearity analysis was conducted to ensure the reliability of the input variables. Three regression algorithms, Random Forest Regression (RFR), Multiple Linear Regression (MLR), and Decision Tree Regression (DTR), were compared for their predictive performance. Among them, RFR was the most accurate, with the highest R² and the lowest error metrics (MAE, RMSE) on both the training and testing datasets for both TDS and TA. While MLR offered consistent and interpretable results, particularly for TA, DTR exhibited strong overfitting and generalized poorly to the test data. The results highlight the superiority of ensemble learning approaches, particularly RFR, in capturing complex, nonlinear relationships within groundwater quality datasets. Applying ML in this context provides a cost-effective and scalable alternative to conventional laboratory-based monitoring methods. It also enables the identification of influential water quality parameters, supports contamination risk assessment, and contributes to evidence-based water resource management strategies. These findings demonstrate the potential of ML tools to enhance groundwater monitoring and advance sustainable water governance in arid and semi-arid regions.
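
The abstract ranks the models by R², MAE, and RMSE. As a minimal sketch of how these three scores are conventionally computed, the Python below uses the standard textbook definitions; the function names and the sample values are illustrative assumptions and are not taken from the study or its dataset.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large residuals more than MAE."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def evaluate(y_true, y_pred):
    """Collect the three scores named in the abstract into one dict."""
    return {
        "MAE": mae(y_true, y_pred),
        "RMSE": rmse(y_true, y_pred),
        "R2": r2(y_true, y_pred),
    }


# Made-up TDS values in mg/L, for illustration only (not data from the study).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

scores = evaluate(observed, predicted)
print(scores)
```

Computing the same scores separately on the training and testing splits is what exposes overfitting of the kind reported for DTR: a high training R² paired with a markedly lower test R².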

Keywords