Evaluating the predictive accuracy of some regression models and artificial neural networks in streamflow forecasting (a case study of the Kaduna River, Northwest Nigeria)

Lawal Mamudu, Ali Aldrees, Salisu Dan’azumi, Alhassan Yahaya

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

The study uses 30 years (1988 to 2017) of monthly average discharge data from the Nigeria Hydrological Service Agency and meteorological data (temperature, rainfall, and evaporation) from the Nigeria Meteorological Agency to compare the predictive accuracy of SARIMA, MLR, and ANN models. The SARIMA model was constructed using only discharge data, while the MLR and ANN models were constructed using the discharge, temperature, rainfall, and evaporation data alongside their lagged copies. The dataset was divided into a training set (January 1988 to December 2008) and a validation set (January 2009 to December 2017). The optimal SARIMA model was selected based on the lowest AIC, BIC, and MSE values and the highest R² value. The MLR model parameters were estimated using the Ordinary Least Squares (OLS) method, and the ANN model was trained using the Sequential API from TensorFlow's Keras library, with features standardized to improve convergence. Three ANN architectures were tested, differing in the number of layers and neurons. The models were evaluated using the following statistical metrics: MSE, Coefficient of Determination (R²), Nash–Sutcliffe Efficiency (NSE), and specialized NSE metrics for high- and low-flow conditions. The results indicated that the MLR Model (Log Transformed Series) performed best overall, with the lowest validation MSE (0.0985), highest validation R² (0.8219), and highest validation NSE (0.8219). However, it struggled to predict extreme (high- and low-flow) conditions. The ANN Model demonstrated balanced performance across different flow conditions, excelling in validation NSE for the high-flow (0.4702) and low-flow (−0.6082) categories. The SARIMA Model performed worst overall, with the highest validation MSE (2.0948) and the lowest validation R² (0.7445) and NSE (0.7445) values. The study concluded that while the MLR Model (Log Transformed Series) shows the best overall performance in terms of accuracy and fit, the ANN Model demonstrates a more balanced performance across different flow conditions.
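
The abstract names NSE and flow-specific NSE variants without defining them. As a minimal sketch, the standard Nash–Sutcliffe Efficiency can be computed as below. The flow-specific variant shown here is an assumption, not the authors' stated method: it restricts the calculation to observations inside a quantile band of observed flow, and the function name `nse_for_flow_range`, the 0.25/0.75 thresholds, and the sample numbers are all hypothetical.

```python
import numpy as np


def nse(observed, simulated):
    """Standard Nash-Sutcliffe Efficiency.

    1 minus the ratio of squared prediction error to the squared
    deviation of the observations from their own mean.
    """
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    denom = np.sum((obs - obs.mean()) ** 2)
    if denom == 0.0:
        raise ValueError("NSE is undefined when the observations are constant")
    return 1.0 - np.sum((obs - sim) ** 2) / denom


def nse_for_flow_range(observed, simulated, lower_q=0.0, upper_q=1.0):
    """NSE over observations lying between two quantiles of observed flow.

    ASSUMPTION: the paper's abstract does not say how its high-flow and
    low-flow NSE metrics are defined; quantile banding is one common choice.
    """
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    low, high = np.quantile(obs, [lower_q, upper_q])
    mask = (obs >= low) & (obs <= high)
    return nse(obs[mask], sim[mask])


# Synthetic monthly discharges, for illustration only (not study data).
obs = np.array([12.0, 15.0, 40.0, 95.0, 210.0, 330.0,
                280.0, 120.0, 60.0, 30.0, 18.0, 13.0])
sim = obs * 0.9 + 5.0  # a deliberately simple, biased "forecast"

overall = nse(obs, sim)
high_flow = nse_for_flow_range(obs, sim, lower_q=0.75)  # top quarter of flows
low_flow = nse_for_flow_range(obs, sim, upper_q=0.25)   # bottom quarter of flows
```

Under a definition like this, a subset NSE can be negative even when the overall NSE is high, because the observations within a narrow flow band vary little about their own mean, so modest errors dominate the ratio.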

Original language: English
Article number: 125
Journal: Modeling Earth Systems and Environment
Volume: 11
Issue number: 2
State: Published - Apr 2025

Keywords

  • Artificial neural network
  • Kaduna river
  • Multiple linear regression
  • SARIMA
  • Streamflow
