1 Department of Statistics, Florida State University, Tallahassee, USA
2 Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst, USA
3 Department of Statistics, Rice University, Houston, USA
American Journal of Applied Mathematics and Statistics. 2023, Vol. 11 No. 3, 89-97
DOI: 10.12691/ajams-11-3-2
Copyright © 2023 Science and Education Publishing
Cite this paper: Yinpu Li, Siqi Mao, Yaping Yuan, Ziren Wang, Yixin Kang, Yuanxin Yao. Beyond Tides and Time: Machine Learning’s Triumph in Water Quality Forecasting. American Journal of Applied Mathematics and Statistics. 2023; 11(3):89-97. doi: 10.12691/ajams-11-3-2.
Correspondence to: Siqi Mao, Department of Statistics, Florida State University, Tallahassee, USA. Email: mden17g@gmail.com
Abstract
Water resources are essential for sustaining human livelihoods and environmental well-being, and accurate water quality prediction plays a pivotal role in effective resource management and pollution mitigation. In this study, we assess five predictive models (linear regression, Random Forest, XGBoost, LightGBM, and an MLP neural network) for forecasting pH values in Georgia, USA. LightGBM emerges as the top-performing model, achieving the highest average precision. Our analysis underscores the strength of tree-based models on this regression task and reveals the sensitivity of MLP neural networks to feature scaling. Counter-intuitively, machine learning models that do not explicitly account for temporal dependence or spatial structure outperform spatial-temporal models. This result challenges conventional assumptions and highlights the potential of such models for practical water quality prediction. Our research aims to establish a robust predictive pipeline accessible to both data science experts and those without domain-specific knowledge, and it offers a perspective on achieving both high prediction accuracy and interpretability. The findings emphasize the value of data-driven approaches relative to traditional spatial-temporal models and offer insights for water resource management and environmental protection.
Keywords