Environmental assessment based surface water quality prediction using hyper-parameter optimized machine learning models based on consistent big data

  • Muhammad Izhar Shah
  • Muhammad Faisal Javed
  • Abdulaziz Alqahtani
  • Ali Aldrees

Research output: Contribution to journal › Article › peer-review

78 Scopus citations

Abstract

Prediction of dissolved oxygen (DO) and total dissolved solids (TDS) is of paramount importance for water environmental protection and ecosystem analysis. Traditional methods for water quality prediction suffer from untuned hyper-parameters. To address the hyper-parameter setting problem, the present study proposes a framework for tuning the hyper-parameters of a feed forward neural network (FFNN) and gene expression programming (GEP) with particle swarm optimization (PSO). The PSO-coupled hybrid feed forward neural network (PSO-FFNN) and hybrid gene expression programming (PSO-GEP) models were then used to predict DO and TDS levels in the upper Indus River. Based on a consistent thirty-year dataset, the most influential input parameters for DO and TDS prediction were determined using principal component analysis (PCA). Model performance was evaluated using five statistical evaluation measures. Modeling results indicated excellent search efficiency of the PSO algorithm in optimizing the structure and hyper-parameters of the FFNN and GEP. The PCA results revealed that magnesium, chloride, sulphate, bicarbonates, specific conductivity, and water temperature are appropriate inputs for DO modeling, whereas calcium, magnesium, sodium, chloride, bicarbonates, and specific conductivity were the influential parameters for TDS. Both proposed hybrid models showed improved accuracy in predicting DO and TDS; however, the hybrid PSO-GEP model achieved better accuracy than the PSO-FFNN, with an R value above 0.85, a root mean squared error (RMSE) below 3 mg/l, and a performance index value close to 1. The external validation criteria confirmed that the overfitting issue was resolved and that the models' results generalize. Cross-validation of the model output attained the best statistical metrics, i.e. R = 0.87 and RMSE = 2.67 for the PSO-FFNN model and R = 0.895 and RMSE = 2.21 for the PSO-GEP model. The research findings demonstrated that implementing artificial intelligence models with an optimization routine can lead to optimized models for accurate prediction of water quality.
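The tuning loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`pso_minimize`, `rmse`, `pearson_r`, `val_rmse`), the PSO coefficients, and the synthetic six-input dataset are all assumptions made for illustration, and a ridge regression with a single regularization hyper-parameter stands in for the FFNN and GEP models. It shows only the general idea of letting PSO search a hyper-parameter space by minimizing validation RMSE, then reporting R and RMSE.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient R between observations and predictions."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def pso_minimize(objective, lower, upper, n_particles=20, n_iters=50,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO over a box; coefficient values are assumed defaults."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    pos = rng.uniform(lower, upper, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest_pos = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    g = int(np.argmin(pbest_val))
    gbest_pos, gbest_val = pbest_pos[g].copy(), float(pbest_val[g])
    for _ in range(n_iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity: inertia + pull toward personal best + pull toward global best.
        vel = (inertia * vel
               + c1 * r1 * (pbest_pos - pos)
               + c2 * r2 * (gbest_pos - pos))
        pos = np.clip(pos + vel, lower, upper)  # keep particles inside the box
        vals = np.array([objective(p) for p in pos])
        improved = vals < pbest_val
        pbest_pos[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:
            gbest_pos, gbest_val = pbest_pos[g].copy(), float(pbest_val[g])
    return gbest_pos, gbest_val

# Synthetic stand-in data (NOT the Indus River dataset): six inputs, one target.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(200, 6))
true_w = np.array([1.5, -2.0, 0.5, 0.0, 1.0, -1.0])
y = X @ true_w + 0.3 * data_rng.normal(size=200)
X_tr, y_tr, X_va, y_va = X[:150], y[:150], X[150:], y[150:]

def fit_ridge(alpha):
    """Closed-form ridge regression; a placeholder for the FFNN/GEP learners."""
    return np.linalg.solve(X_tr.T @ X_tr + alpha * np.eye(6), X_tr.T @ y_tr)

def val_rmse(params):
    """PSO objective: validation RMSE as a function of log10(alpha)."""
    return rmse(y_va, X_va @ fit_ridge(10.0 ** params[0]))

best_params, best_rmse = pso_minimize(val_rmse, lower=[-4.0], upper=[3.0])
best_pred = X_va @ fit_ridge(10.0 ** best_params[0])
print(f"log10(alpha) = {best_params[0]:.3f}, "
      f"RMSE = {best_rmse:.3f}, R = {pearson_r(y_va, best_pred):.3f}")
```

Searching over `log10(alpha)` rather than `alpha` itself is a design choice in this sketch: it lets the swarm cover several orders of magnitude with a uniform initial spread. For a neural network, the same objective function would instead train and score a model for each candidate vector of hyper-parameters.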

Original language: English
Pages (from-to): 324-340
Number of pages: 17
Journal: Process Safety and Environmental Protection
Volume: 151
State: Published - Jul 2021

Keywords

  • Cross-validation
  • Environmental protection
  • Machine learning modeling
  • Particle swarm optimization
  • Principal component analysis
  • River water quality
