TY - JOUR
T1 - Predictive modeling approach for surface water quality
T2 - Development and comparison of machine learning models
AU - Izhar Shah, Muhammad
AU - Alaloul, Wesam Salah
AU - Ali Alqahtani, Abdulaziz
AU - Aldrees, Ali
AU - Ali Musarat, Muhammad
AU - Javed, Muhammad Faisal
N1 - Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.
PY - 2021/7/2
Y1 - 2021/7/2
N2 - Water pollution is an increasing global issue that societies are facing and is threating human health, ecosystem functions and agriculture production. The distinguished features of artificial intelligence (AI) based modeling can deliver a deep insight pertaining to rising water quality concerns. The current study investigates the predictive performance of gene expression programming (GEP), artificial neural network (ANN) and linear regression model (LRM) for modeling monthly total dissolved solids (TDS) and specific conductivity (EC) in the upper Indus River at two outlet stations. In total, 30 years of historical water quality data, comprising 360 TDS and EC monthly records, were used for models training and testing. Based on a significant correlation, the TDS and EC modeling were correlated with seven input parameters. Results were evaluated using various performance measure indicators, error assessment and external criteria. The simulated outcome of the models indicated a strong association with actual data where the correlation coefficient above 0.9 was observed for both TDS and EC. Both the GEP and ANN models remained the reliable techniques in predicting TDS and EC. The formulated GEP mathematical equations depict its novelty as compared to ANN and LRM. The results of sensitivity analysis indicated the increasing trend of input variables affecting TDS as HCO3− (22.33%) > Cl− (21.66%) > Mg2+ (16.98%) > Na+ (14.55%) > Ca2+ (12.92%) > SO4 2− (11.55%) > pH (0%), while, in the case of EC, it followed the trend as HCO3− (42.36%) > SO4 2−(25.63%) > Ca2+ (13.59%) > Cl− (12.8%) > Na+ (5.01%) > pH (0.61%) > Mg2+ (0%). The parametric analysis revealed that models have incorporated the effect of all the input parameters in the modeling process. The external assessment criteria confirmed the generalized outcome and robustness of the proposed approaches. Conclusively, the outcomes of this study demonstrated that the formulation of AI based models are cost effective and helpful for river water quality assessment, management and policy making.
AB - Water pollution is an increasing global issue that societies are facing and is threating human health, ecosystem functions and agriculture production. The distinguished features of artificial intelligence (AI) based modeling can deliver a deep insight pertaining to rising water quality concerns. The current study investigates the predictive performance of gene expression programming (GEP), artificial neural network (ANN) and linear regression model (LRM) for modeling monthly total dissolved solids (TDS) and specific conductivity (EC) in the upper Indus River at two outlet stations. In total, 30 years of historical water quality data, comprising 360 TDS and EC monthly records, were used for models training and testing. Based on a significant correlation, the TDS and EC modeling were correlated with seven input parameters. Results were evaluated using various performance measure indicators, error assessment and external criteria. The simulated outcome of the models indicated a strong association with actual data where the correlation coefficient above 0.9 was observed for both TDS and EC. Both the GEP and ANN models remained the reliable techniques in predicting TDS and EC. The formulated GEP mathematical equations depict its novelty as compared to ANN and LRM. The results of sensitivity analysis indicated the increasing trend of input variables affecting TDS as HCO3− (22.33%) > Cl− (21.66%) > Mg2+ (16.98%) > Na+ (14.55%) > Ca2+ (12.92%) > SO4 2− (11.55%) > pH (0%), while, in the case of EC, it followed the trend as HCO3− (42.36%) > SO4 2−(25.63%) > Ca2+ (13.59%) > Cl− (12.8%) > Na+ (5.01%) > pH (0.61%) > Mg2+ (0%). The parametric analysis revealed that models have incorporated the effect of all the input parameters in the modeling process. The external assessment criteria confirmed the generalized outcome and robustness of the proposed approaches. Conclusively, the outcomes of this study demonstrated that the formulation of AI based models are cost effective and helpful for river water quality assessment, management and policy making.
KW - External validation
KW - Parametric study
KW - Regression analysis
KW - River water quality
KW - Soft computing
KW - Specific conductivity
KW - Sustainable environment
KW - Total dissolved solids
KW - Variable importance
UR - http://www.scopus.com/inward/record.url?scp=85109924142&partnerID=8YFLogxK
U2 - 10.3390/su13147515
DO - 10.3390/su13147515
M3 - Article
AN - SCOPUS:85109924142
SN - 2071-1050
VL - 13
JO - Sustainability (Switzerland)
JF - Sustainability (Switzerland)
IS - 14
M1 - 7515
ER -