TY - JOUR
T1 - Optimization and validation of drug solubility by development of advanced artificial intelligence models
AU - Liu, Yaoyang
AU - Ahmed Smait, Drai
AU - Yaseen Naser, Abbas
AU - M. A. Altalbawy, Farag
AU - Bahri, Hala
AU - Abdul Kadhim Ruhaima, Ali
AU - Zayad Fathallah, Thura
AU - Hadrawi, Salema K.
AU - Alsaddon, Refad E.
AU - Alshetaili, Abdullah
AU - Alsubaiyel, Amal M.
N1 - Publisher Copyright: © 2022 Elsevier B.V.
PY - 2023/2/15
Y1 - 2023/2/15
N2 - Over the last ten years, the application of novel machine learning models to the solubility of drugs, especially anticancer drugs, in supercritical carbon dioxide (ScCO2) systems has gained remarkable popularity. In this research, three distinct ensemble models based on decision trees, Random Forest (RF), Gradient Boosting Trees (GBRT), and Extremely Randomized Trees (ERT), have been employed for the first time to predict the solubility of busulfan, an anticancer drug. The dataset has two input parameters, temperature (T) and pressure (P), and a single output, solubility (Y). After implementing these ensemble models and tuning their hyperparameters, their performance was assessed through several metrics. All three models show an R-squared score of more than 0.9, but in terms of RMSE, the error rates are 1.80E-04, 1.72E-04, and 1.03E-04 for the RF, ERT, and GBRT models, respectively. MAPE values of 4.51E-01, 4.87E-01, and 3.62E-01 were also found for the RF, ERT, and GBRT models, respectively. GBRT has been selected as the best model because of its lower RMSE and MAPE. An analysis has also been performed to find the optimal solubility, which can be considered the (x1 = 38.3, x2 = 333.1, Y = 1.36E-03) vector.
AB - Over the last ten years, the application of novel machine learning models to the solubility of drugs, especially anticancer drugs, in supercritical carbon dioxide (ScCO2) systems has gained remarkable popularity. In this research, three distinct ensemble models based on decision trees, Random Forest (RF), Gradient Boosting Trees (GBRT), and Extremely Randomized Trees (ERT), have been employed for the first time to predict the solubility of busulfan, an anticancer drug. The dataset has two input parameters, temperature (T) and pressure (P), and a single output, solubility (Y). After implementing these ensemble models and tuning their hyperparameters, their performance was assessed through several metrics. All three models show an R-squared score of more than 0.9, but in terms of RMSE, the error rates are 1.80E-04, 1.72E-04, and 1.03E-04 for the RF, ERT, and GBRT models, respectively. MAPE values of 4.51E-01, 4.87E-01, and 3.62E-01 were also found for the RF, ERT, and GBRT models, respectively. GBRT has been selected as the best model because of its lower RMSE and MAPE. An analysis has also been performed to find the optimal solubility, which can be considered the (x1 = 38.3, x2 = 333.1, Y = 1.36E-03) vector.
KW - Anticancer drug
KW - Machine learning
KW - Model validation
KW - Simulation
KW - Supercritical fluids
UR - http://www.scopus.com/inward/record.url?scp=85145769991&partnerID=8YFLogxK
U2 - 10.1016/j.molliq.2022.121113
DO - 10.1016/j.molliq.2022.121113
M3 - Article
AN - SCOPUS:85145769991
SN - 0167-7322
VL - 372
JO - Journal of Molecular Liquids
JF - Journal of Molecular Liquids
M1 - 121113
ER -