Evaluating the predictive accuracy of some regression models and artificial neural networks in streamflow forecasting (a case study of the Kaduna River, Northwest Nigeria)

Lawal Mamudu; Ali Aldrees; Salisu Dan’azumi; Alhassan Yahaya

doi:10.1007/s40808-025-02296-0

Civil Engineering

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

The study utilizes 30 years of monthly average discharge data from the Nigeria Hydrological Service Agency and meteorological data (temperature, rainfall, and evaporation) from the Nigeria Meteorological Agency, spanning 1988 to 2017, to compare the predictive accuracy of SARIMA, MLR, and ANN models. The SARIMA model was constructed using only discharge data, while the MLR and ANN models were constructed using the discharge, temperature, rainfall, and evaporation data alongside their lagged copies. The dataset was divided into a training set (January 1988 to December 2008) and a validation set (January 2009 to December 2017). The optimal SARIMA model was selected based on the lowest AIC, BIC, and MSE values, and the highest R² value. The MLR model parameters were estimated using the Ordinary Least Squares (OLS) method, and the ANN model was trained using the Sequential API from TensorFlow’s Keras library, with features standardized to improve convergence. Three ANN model structures were experimented with, each varying in the number of neurons and layers. The models were evaluated using statistical metrics: MSE, Coefficient of Determination (R²), Nash–Sutcliffe Efficiency (NSE), and specialized NSE metrics for high and low flow conditions. The results indicated that the MLR Model (Log Transformed Series) performed best overall, with the lowest Validation MSE (0.0985), highest Validation R² (0.8219), and highest Validation NSE (0.8219). However, it struggled with extreme flow conditions, particularly in predicting high and low-flow scenarios. The ANN Model demonstrated balanced performance across different flow conditions, excelling in the Validation NSE for High Flow (0.4702) and Low Flow (−0.6082) categories. The SARIMA Model performed the least well overall, with the highest Validation MSE (2.0948) and the lowest Validation R² (0.7445) and NSE (0.7445) values. The study concluded that while the MLR Model (Log Transformed Series) shows the best overall performance in terms of accuracy and fit, the ANN Model demonstrates a more balanced performance across different flow conditions.
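
The evaluation workflow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the helper names (`nse`, `nse_by_flow`, `add_lags`, `fit_ols`, `predict_ols`), the quantile thresholds used to define high and low flow, the number of lags, and the synthetic discharge series are all assumptions made for illustration, since the abstract does not specify them. Only the chronological split (252 training months for January 1988 to December 2008, 108 validation months for January 2009 to December 2017) is taken from the text. When R² is computed as one minus the ratio of squared error to total variance, it coincides with NSE, which is consistent with the identical R² and NSE values reported.

```python
import numpy as np


def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 - SSE / squared deviations from the observed mean."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    sse = np.sum((observed - simulated) ** 2)
    return 1.0 - sse / np.sum((observed - observed.mean()) ** 2)


def nse_by_flow(observed, simulated, low_q=0.25, high_q=0.75):
    """NSE overall and on high/low-flow subsets.

    The quantile thresholds are an assumption; the abstract does not say
    how high and low flow conditions were defined.
    """
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    low_thr, high_thr = np.quantile(observed, [low_q, high_q])
    high = observed >= high_thr
    low = observed <= low_thr
    return {
        "overall": nse(observed, simulated),
        "high_flow": nse(observed[high], simulated[high]),
        "low_flow": nse(observed[low], simulated[low]),
    }


def add_lags(series, n_lags):
    """Return (X, y) where row t of X holds series[t-1], ..., series[t-n_lags]."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    X = np.column_stack([series[n_lags - k : n - k] for k in range(1, n_lags + 1)])
    return X, series[n_lags:]


def fit_ols(X, y):
    """Ordinary least squares with an intercept, solved via numpy's lstsq."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_ols(coef, X):
    return coef[0] + X @ coef[1:]


# Synthetic monthly series (NOT the Kaduna River data): 30 years = 360 months.
rng = np.random.default_rng(0)
months = np.arange(360)
flow = 50 + 30 * np.sin(2 * np.pi * months / 12) + rng.normal(0, 5, size=360)

n_lags = 12  # hypothetical choice of lag depth
X, y = add_lags(flow, n_lags)

# Chronological split: the first 252 months are the training period.
n_train = 252 - n_lags  # rows whose target falls inside the training period
coef = fit_ols(X[:n_train], y[:n_train])
scores = nse_by_flow(y[n_train:], predict_ols(coef, X[n_train:]))
print(scores)
```

Splitting by time rather than at random keeps every validation month strictly after the training period, which matches the forecasting setting. Scoring the high- and low-flow subsets separately shows how a model with a good overall NSE can still fit the extremes poorly.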

Original language	English
Article number	125
Journal	Modeling Earth Systems and Environment
Volume	11
Issue number	2
DOIs	https://doi.org/10.1007/s40808-025-02296-0
State	Published - Apr 2025

Keywords

Artificial neural network
Kaduna river
Multiple linear regression
SARIMA
Streamflow

Cite this

@article{bc804c9c7c614d84a7140dcd4bcf69e5,

title = "Evaluating the predictive accuracy of some regression models and artificial neural networks in streamflow forecasting (a case study of the Kaduna River, Northwest Nigeria)",

abstract = "The study utilizes 30 years of monthly average discharge data from the Nigeria Hydrological Service Agency and meteorological data (temperature, rainfall, and evaporation) from the Nigeria Meteorological Agency, spanning from 1988 to 2017 to compare the predictive accuracy of SARIMA, MLR, and ANN models. The SARIMA model was constructed using only discharge data, while the MLR and ANN models were constructed using the discharge, temperature, rainfall, and evaporation data alongside their lagged copies. The dataset was divided into a training set (January 1988 to December 2008) and a validation set (January 2009 to December 2017). The optimal SARIMA model was selected based on the lowest AIC, BIC, and MSE values, and the highest R$^{2}$ value. The MLR model parameters were estimated using the Ordinary Least Squares (OLS) method and the ANN model was trained using the Sequential API from TensorFlow{\textquoteright}s Keras library, with features standardized to improve convergence. Three ANN model structures were experimented with, each varying in the number of neurons and layers. The models were evaluated using statistical metrics: MSE, Coefficient of Determination (R$^{2}$), Nash–Sutcliffe Efficiency (NSE), and specialized NSE metrics for high and low flow conditions. The results indicated that the MLR Model (Log Transformed Series) performed best overall, with the lowest Validation MSE (0.0985), highest Validation R$^{2}$ (0.8219), and highest Validation NSE (0.8219). However, it struggled with extreme flow conditions, particularly in predicting high and low-flow scenarios. The ANN Model demonstrated balanced performance across different flow conditions, excelling in the Validation NSE for High Flow (0.4702) and Low Flow (−0.6082) categories. The SARIMA Model performed the least well overall, with the highest Validation MSE (2.0948) and the lowest Validation R$^{2}$ (0.7445) and NSE (0.7445) values. The study concluded that while the MLR Model (Log Transformed Series) shows the best overall performance in terms of accuracy and fit, the ANN Model demonstrates a more balanced performance across different flow conditions.",

keywords = "Artificial neural network, Kaduna river, Multiple linear regression, SARIMA, Streamflow",

author = "Lawal Mamudu and Ali Aldrees and Salisu Dan{\textquoteright}azumi and Alhassan Yahaya",

note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive licence to Springer Nature Switzerland AG 2025.",

year = "2025",

month = apr,

doi = "10.1007/s40808-025-02296-0",

language = "English",

volume = "11",

journal = "Modeling Earth Systems and Environment",

issn = "2363-6203",

publisher = "Springer International Publishing AG",

number = "2",

}

Evaluating the predictive accuracy of some regression models and artificial neural networks in streamflow forecasting (a case study of the Kaduna River, Northwest Nigeria). / Mamudu, Lawal; Aldrees, Ali; Dan’azumi, Salisu et al.
In: Modeling Earth Systems and Environment, Vol. 11, No. 2, 125, 04.2025.

TY - JOUR

T1 - Evaluating the predictive accuracy of some regression models and artificial neural networks in streamflow forecasting (a case study of the Kaduna River, Northwest Nigeria)

AU - Mamudu, Lawal

AU - Aldrees, Ali

AU - Dan’azumi, Salisu

AU - Yahaya, Alhassan

N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Nature Switzerland AG 2025.

PY - 2025/4

Y1 - 2025/4

AB - The study utilizes 30 years of monthly average discharge data from the Nigeria Hydrological Service Agency and meteorological data (temperature, rainfall, and evaporation) from the Nigeria Meteorological Agency, spanning from 1988 to 2017 to compare the predictive accuracy of SARIMA, MLR, and ANN models. The SARIMA model was constructed using only discharge data, while the MLR and ANN models were constructed using the discharge, temperature, rainfall, and evaporation data alongside their lagged copies. The dataset was divided into a training set (January 1988 to December 2008) and a validation set (January 2009 to December 2017). The optimal SARIMA model was selected based on the lowest AIC, BIC, and MSE values, and the highest R² value. The MLR model parameters were estimated using the Ordinary Least Squares (OLS) method and the ANN model was trained using the Sequential API from TensorFlow’s Keras library, with features standardized to improve convergence. Three ANN model structures were experimented with, each varying in the number of neurons and layers. The models were evaluated using statistical metrics: MSE, Coefficient of Determination (R²), Nash–Sutcliffe Efficiency (NSE), and specialized NSE metrics for high and low flow conditions. The results indicated that the MLR Model (Log Transformed Series) performed best overall, with the lowest Validation MSE (0.0985), highest Validation R² (0.8219), and highest Validation NSE (0.8219). However, it struggled with extreme flow conditions, particularly in predicting high and low-flow scenarios. The ANN Model demonstrated balanced performance across different flow conditions, excelling in the Validation NSE for High Flow (0.4702) and Low Flow (−0.6082) categories. The SARIMA Model performed the least well overall, with the highest Validation MSE (2.0948) and the lowest Validation R² (0.7445) and NSE (0.7445) values. The study concluded that while the MLR Model (Log Transformed Series) shows the best overall performance in terms of accuracy and fit, the ANN Model demonstrates a more balanced performance across different flow conditions.

KW - Artificial neural network

KW - Kaduna river

KW - Multiple linear regression

KW - SARIMA

KW - Streamflow

UR - https://www.scopus.com/pages/publications/85218252511

U2 - 10.1007/s40808-025-02296-0

DO - 10.1007/s40808-025-02296-0

M3 - Article

AN - SCOPUS:85218252511

SN - 2363-6203

VL - 11

JO - Modeling Earth Systems and Environment

JF - Modeling Earth Systems and Environment

IS - 2

M1 - 125

ER -
