TY - JOUR
T1 - Evaluation of water quality indexes with novel machine learning and SHapley Additive ExPlanation (SHAP) approaches
AU - Aldrees, Ali
AU - Khan, Majid
AU - Taha, Abubakr Taha Bakheit
AU - Ali, Mujahid
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2024/2
Y1 - 2024/2
N2 - Water quality indexes (WQI) are pivotal in assessing aquatic systems. Conventional modeling approaches rely on extensive datasets with numerous unspecified inputs, leading to time-consuming WQI assessment procedures. Numerous studies have used machine learning (ML) methods for WQI analysis, but the resulting models often lack interpretability. To address this issue, this study developed five interpretable predictive models, including two gene expression programming (GEP) models, two deep neural network (DNN) models, and one optimizable Gaussian process regressor (OGPR) model, for estimating electrical conductivity (EC) and total dissolved solids (TDS). For model development, a total of 372 monthly records were collected at two outlet stations in the Upper Indus River. The efficacy and accuracy of the models were assessed using various statistical measures, such as correlation (R), mean absolute error (MAE), and root mean square error (RMSE), together with 5-fold cross-validation. The DNN2 model demonstrated outstanding performance compared to the other four models, exhibiting R-values closer to 1.0 for both EC and TDS. However, the genetic programming-based models, GEP1 and GEP2, exhibited comparatively lower accuracy in predicting the water quality indexes. The SHapley Additive exPlanation (SHAP) analysis revealed that bicarbonate, calcium, and sulphate jointly contribute approximately 78 % to EC, while the combined presence of sodium, bicarbonate, calcium, and magnesium accounts for around 87 % of TDS in water. Notably, the influence of pH and chloride was minimal on both water quality indexes. In conclusion, the study highlights the cost-effective and practical potential of predictive models for EC and TDS in assessing and monitoring river water quality.
AB - Water quality indexes (WQI) are pivotal in assessing aquatic systems. Conventional modeling approaches rely on extensive datasets with numerous unspecified inputs, leading to time-consuming WQI assessment procedures. Numerous studies have used machine learning (ML) methods for WQI analysis, but the resulting models often lack interpretability. To address this issue, this study developed five interpretable predictive models, including two gene expression programming (GEP) models, two deep neural network (DNN) models, and one optimizable Gaussian process regressor (OGPR) model, for estimating electrical conductivity (EC) and total dissolved solids (TDS). For model development, a total of 372 monthly records were collected at two outlet stations in the Upper Indus River. The efficacy and accuracy of the models were assessed using various statistical measures, such as correlation (R), mean absolute error (MAE), and root mean square error (RMSE), together with 5-fold cross-validation. The DNN2 model demonstrated outstanding performance compared to the other four models, exhibiting R-values closer to 1.0 for both EC and TDS. However, the genetic programming-based models, GEP1 and GEP2, exhibited comparatively lower accuracy in predicting the water quality indexes. The SHapley Additive exPlanation (SHAP) analysis revealed that bicarbonate, calcium, and sulphate jointly contribute approximately 78 % to EC, while the combined presence of sodium, bicarbonate, calcium, and magnesium accounts for around 87 % of TDS in water. Notably, the influence of pH and chloride was minimal on both water quality indexes. In conclusion, the study highlights the cost-effective and practical potential of predictive models for EC and TDS in assessing and monitoring river water quality.
KW - Deep neural networks
KW - Gene expression programming
KW - Optimizable Gaussian process regressor
KW - SHAP
KW - Water quality indexes
UR - http://www.scopus.com/inward/record.url?scp=85183555550&partnerID=8YFLogxK
U2 - 10.1016/j.jwpe.2024.104789
DO - 10.1016/j.jwpe.2024.104789
M3 - Article
AN - SCOPUS:85183555550
SN - 2214-7144
VL - 58
JO - Journal of Water Process Engineering
JF - Journal of Water Process Engineering
M1 - 104789
ER -