TY - JOUR
T1 - AI-driven wastewater management through comparative analysis of feature selection techniques and predictive models
AU - Dikmen, Faruk
AU - Demir, Ahmet
AU - Özkaya, Bestami
AU - Raza, Muhammad Owais
AU - Rasheed, Jawad
AU - Asuroglu, Tunc
AU - Alsubai, Shtwai
N1 - Publisher Copyright: © The Author(s) 2025.
PY - 2025/12
Y1 - 2025/12
N2 - The integration of artificial intelligence (AI) in wastewater treatment management offers a promising approach to optimizing effluent quality predictions and enhancing operational efficiency. This study evaluates the performance of machine learning models in predicting key wastewater effluent parameters: Chemical Oxygen Demand (COD), Biochemical Oxygen Demand (BOD), Total Suspended Solids (TSS), Total Effluent Nitrogen, and Total Effluent Phosphorus. Three feature selection techniques were applied to identify the most significant predictors: SelectKBest, Mutual Information, and Recursive Feature Elimination (RFE) using Random Forest. The study leveraged ensemble learning models, including XGBoost, Random Forest, Gradient Boosting, and LightGBM, and compared them with Decision Tree models. The results demonstrate that effluent volatile suspended solids (VSS) consistently held the highest predictive importance across all feature selection methods. Ensemble models significantly outperformed Decision Trees, with Gradient Boosting achieving the best predictive accuracy for TSS and total nitrogen (Mean Absolute Error (MAE): 3.667; R²: 97.53%), XGBoost excelling in COD prediction (MAE: 6.251; R²: 83.41%), and XGBoost also showing superior performance for BOD (MAE: 1.589; R²: 79.64%). LightGBM yielded the highest precision in predicting total phosphate (MAE: 0.230; R²: 28.68%). Decision Tree models consistently underperformed, exhibiting the highest error rates. These findings highlight the potential of AI-driven approaches in wastewater management to improve decision-making, regulatory compliance, and resource efficiency. However, limitations such as operational irregularities and seasonal variations remain challenges for further refinement.
AB - The integration of artificial intelligence (AI) in wastewater treatment management offers a promising approach to optimizing effluent quality predictions and enhancing operational efficiency. This study evaluates the performance of machine learning models in predicting key wastewater effluent parameters: Chemical Oxygen Demand (COD), Biochemical Oxygen Demand (BOD), Total Suspended Solids (TSS), Total Effluent Nitrogen, and Total Effluent Phosphorus. Three feature selection techniques were applied to identify the most significant predictors: SelectKBest, Mutual Information, and Recursive Feature Elimination (RFE) using Random Forest. The study leveraged ensemble learning models, including XGBoost, Random Forest, Gradient Boosting, and LightGBM, and compared them with Decision Tree models. The results demonstrate that effluent volatile suspended solids (VSS) consistently held the highest predictive importance across all feature selection methods. Ensemble models significantly outperformed Decision Trees, with Gradient Boosting achieving the best predictive accuracy for TSS and total nitrogen (Mean Absolute Error (MAE): 3.667; R²: 97.53%), XGBoost excelling in COD prediction (MAE: 6.251; R²: 83.41%), and XGBoost also showing superior performance for BOD (MAE: 1.589; R²: 79.64%). LightGBM yielded the highest precision in predicting total phosphate (MAE: 0.230; R²: 28.68%). Decision Tree models consistently underperformed, exhibiting the highest error rates. These findings highlight the potential of AI-driven approaches in wastewater management to improve decision-making, regulatory compliance, and resource efficiency. However, limitations such as operational irregularities and seasonal variations remain challenges for further refinement.
KW - Artificial intelligence
KW - Environmental engineering
KW - Feature selection
KW - Machine learning
KW - Wastewater treatment plant
UR - http://www.scopus.com/inward/record.url?scp=105010639787&partnerID=8YFLogxK
U2 - 10.1038/s41598-025-07124-0
DO - 10.1038/s41598-025-07124-0
M3 - Article
C2 - 40659650
AN - SCOPUS:105010639787
SN - 2045-2322
VL - 15
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 25347
ER -