TY - JOUR
T1 - Hybrid Bat and Salp Swarm Algorithm for Feature Selection and Classification of Crisis-Related Tweets in Social Networks
AU - Farooqui, Nafees Akhter
AU - Hasan, Mohammad Kamrul
AU - Noori, Mohammed Ahsan Raza
AU - Rahman, Abdul Hadi Abd
AU - Islam, Shayla
AU - Haleem, Mohammad
AU - Ahmad, Sheikh Fahad
AU - Khan, Asif
AU - Ahmed, Fatima Rayan Awad
AU - Babiker, Nissrein Babiker Mohammed
AU - Ahmed, Thowiba E.
AU - Khan, Atta Ur Rehman
N1 - Publisher Copyright: © 2013 IEEE.
PY - 2024
Y1 - 2024
AB - Twitter is a useful tool for tracking and managing crisis-related incidents. However, textual data contains many irrelevant features, which leads to high dimensionality, increases computational cost, and degrades classification performance. To address this problem, this work presents SBBASSA, a Spark-based hybrid of the binary Bat algorithm (BBA) and the binary Salp swarm algorithm (BSSA), for feature selection and classification of crisis-related tweets. In the proposed technique, the standard BBA and BSSA are hybridized to enhance their exploration capabilities, and the combined algorithm is then implemented in parallel on the Apache Spark framework to reduce the overall execution time of feature selection. A support vector machine (SVM) classifier is applied during wrapper-based feature subset selection and classification. The performance of SBBASSA was evaluated on six benchmark crisis tweet datasets (Hurricane Sandy, Boston Bombings, Oklahoma Tornado, West Texas Explosion, Alberta Floods, and Queensland Floods) and compared with standard BSSA, BBA, and binary particle swarm optimization (BPSO). Results showed that SBBASSA outperformed the other algorithms in crisis tweet classification, achieving the highest accuracy with the smallest feature subset in a reduced execution time.
KW - Apache Spark
KW - bat algorithm
KW - crisis tweet classification
KW - feature selection
KW - salp swarm algorithm
UR - http://www.scopus.com/inward/record.url?scp=85197492736&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2024.3421571
DO - 10.1109/ACCESS.2024.3421571
M3 - Article
AN - SCOPUS:85197492736
SN - 2169-3536
VL - 12
SP - 103908
EP - 103920
JO - IEEE Access
JF - IEEE Access
ER -