TY - JOUR
T1 - Optimization and comparison of machine learning methods in estimation of carbon dioxide loading in chemical solvents for environmental applications
AU - Chen, Liang
AU - Huang, Huan
AU - Thangavelu, Lakshmi
AU - Abdelbasset, Walid Kamal
AU - Bokov, Dmitry Olegovich
AU - Algarni, Mohammed
AU - Ghazali, Sami
AU - Alashwal, May
N1 - Publisher Copyright: © 2022 Elsevier B.V.
PY - 2022/3/1
Y1 - 2022/3/1
N2 - In this study, we developed several machine learning ensemble models for predicting and correlating CO2 solubility in amino acid salt solutions at different concentrations. The models were used to establish a relationship between process parameters and CO2 loading in the solvent. The sole model output was the amount of CO2 loaded into and dissolved in the chemical solvent. We tried three different estimators to correlate the CO2 loading, drawn from bagging and boosting, both of which are subclasses of ensemble techniques. Ensemble techniques combine a number of weak models to build a strong and robust model for predicting solubility values. The models used were random forests (RF), extremely randomized trees (ERT), and boosted K-NN (with AdaBoost). We repeated the procedure multiple times to obtain the best model and thereby establish suitable hyper-parameters for each model. Following optimization, the R2 scores for all three models were above 0.9, indicating high predictive performance. ERT had the highest R2 score among all models, 0.999, followed by boosted K-NN at 0.998 and RF at 0.992.
AB - In this study, we developed several machine learning ensemble models for predicting and correlating CO2 solubility in amino acid salt solutions at different concentrations. The models were used to establish a relationship between process parameters and CO2 loading in the solvent. The sole model output was the amount of CO2 loaded into and dissolved in the chemical solvent. We tried three different estimators to correlate the CO2 loading, drawn from bagging and boosting, both of which are subclasses of ensemble techniques. Ensemble techniques combine a number of weak models to build a strong and robust model for predicting solubility values. The models used were random forests (RF), extremely randomized trees (ERT), and boosted K-NN (with AdaBoost). We repeated the procedure multiple times to obtain the best model and thereby establish suitable hyper-parameters for each model. Following optimization, the R2 scores for all three models were above 0.9, indicating high predictive performance. ERT had the highest R2 score among all models, 0.999, followed by boosted K-NN at 0.998 and RF at 0.992.
KW - Absorption
KW - CO2 solubility
KW - Machine learning
KW - Modeling
KW - Purification
UR - http://www.scopus.com/inward/record.url?scp=85122915713&partnerID=8YFLogxK
U2 - 10.1016/j.molliq.2022.118513
DO - 10.1016/j.molliq.2022.118513
M3 - Article
AN - SCOPUS:85122915713
SN - 0167-7322
VL - 349
JO - Journal of Molecular Liquids
JF - Journal of Molecular Liquids
M1 - 118513
ER -