TY - JOUR
T1 - Evaluating the predictive accuracy of some regression models and artificial neural networks in streamflow forecasting (a case study of the Kaduna River, Northwest Nigeria)
AU - Mamudu, Lawal
AU - Aldrees, Ali
AU - Dan’azumi, Salisu
AU - Yahaya, Alhassan
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Nature Switzerland AG 2025.
PY - 2025/4
Y1 - 2025/4
AB - The study uses 30 years of monthly average discharge data from the Nigeria Hydrological Service Agency and meteorological data (temperature, rainfall, and evaporation) from the Nigeria Meteorological Agency, spanning 1988 to 2017, to compare the predictive accuracy of SARIMA, MLR, and ANN models. The SARIMA model was constructed using only discharge data, while the MLR and ANN models were constructed using the discharge, temperature, rainfall, and evaporation data alongside their lagged copies. The dataset was divided into a training set (January 1988 to December 2008) and a validation set (January 2009 to December 2017). The optimal SARIMA model was selected based on the lowest AIC, BIC, and MSE values and the highest R² value. The MLR model parameters were estimated using the Ordinary Least Squares (OLS) method, and the ANN model was trained using the Sequential API from TensorFlow's Keras library, with features standardized to improve convergence. Three ANN architectures were tested, each with a different number of neurons and layers. The models were evaluated using statistical metrics: MSE, Coefficient of Determination (R²), Nash–Sutcliffe Efficiency (NSE), and specialized NSE metrics for high-flow and low-flow conditions. The results indicated that the MLR Model (Log Transformed Series) performed best overall, with the lowest Validation MSE (0.0985), highest Validation R² (0.8219), and highest Validation NSE (0.8219). However, it struggled under extreme flow conditions, predicting both high-flow and low-flow scenarios poorly. The ANN Model demonstrated balanced performance across different flow conditions, excelling in the Validation NSE for the High Flow (0.4702) and Low Flow (−0.6082) categories. The SARIMA Model performed worst overall, with the highest Validation MSE (2.0948) and the lowest Validation R² (0.7445) and NSE (0.7445) values. The study concluded that while the MLR Model (Log Transformed Series) shows the best overall performance in terms of accuracy and fit, the ANN Model demonstrates a more balanced performance across different flow conditions.
KW - Artificial neural network
KW - Kaduna river
KW - Multiple linear regression
KW - SARIMA
KW - Streamflow
UR - http://www.scopus.com/inward/record.url?scp=85218252511&partnerID=8YFLogxK
U2 - 10.1007/s40808-025-02296-0
DO - 10.1007/s40808-025-02296-0
M3 - Article
AN - SCOPUS:85218252511
SN - 2363-6203
VL - 11
JO - Modeling Earth Systems and Environment
JF - Modeling Earth Systems and Environment
IS - 2
M1 - 125
ER -