Phishing Detection in Arabic SMS Messages using Natural Language Processing

Alya Ibrahim, Sarah Alyousef, Hayfa Alajmi, Rana Aldossari, Fatma Masmoudi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

The integration of mobile phones into daily life has made the Short Message Service (SMS) a crucial communication tool. Users receive text messages from banks, electronic government services, businesses, and payment services to verify their identities, which makes these messages a vector for manipulation aimed at gaining access to personal data. This study proposes a technique that detects Arabic phishing messages using natural language processing and a random forest classifier. The performance of the random forest classifier is compared with that of other machine learning algorithms, namely K-Nearest Neighbors (KNN), AdaBoost, and Logistic Regression. On all evaluation metrics, the random forest classifier outperformed the other classifiers. The model was trained on 638 phishing messages and 4844 legitimate ones. The experimental results indicate that the proposed approach achieves an accuracy of 98.66%, a precision of 99.10%, a recall of 98.23%, and an F1 score of 98.67%.
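As a reading aid for the four figures reported in the abstract, the sketch below shows how accuracy, precision, recall, and F1 are conventionally derived from a binary confusion matrix, with phishing as the positive class. It also checks that the reported F1 score is consistent with the reported precision and recall. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not taken from the paper, whose evaluation code and test split are not described here.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary metrics, treating 'phishing' as the positive class.

    tp: phishing messages flagged as phishing
    fp: legitimate messages flagged as phishing
    fn: phishing messages missed
    tn: legitimate messages passed as legitimate
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract (percentages written as fractions).
REPORTED_PRECISION = 0.9910
REPORTED_RECALL = 0.9823
REPORTED_F1 = 0.9867

# F1 implied by the reported precision and recall. Because the inputs are
# already rounded, agreement is only expected up to rounding error.
implied_f1 = f1_from(REPORTED_PRECISION, REPORTED_RECALL)

# Hypothetical counts (NOT from the paper), just to exercise the formulas.
example = classification_metrics(tp=90, fp=10, fn=30, tn=870)

if __name__ == "__main__":
    print(f"implied F1 from reported P and R: {implied_f1:.4f}")
    print(f"reported F1:                      {REPORTED_F1:.4f}")
    for name, value in example.items():
        print(f"{name}: {value:.4f}")
```

Reporting precision and recall alongside accuracy matters for a data set like this one, where legitimate messages outnumber phishing messages by roughly 7.6 to 1 (4844 versus 638): accuracy alone can look high even when many phishing messages are missed.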

Original language: English
Title of host publication: Proceedings - 2024 7th International Women in Data Science Conference at Prince Sultan University, WiDS-PSU 2024
Editors: Amjad Rehm, Ahmad Taher Azar, Tanzila Saba
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 141-146
Number of pages: 6
ISBN (Electronic): 9798350395839
State: Published - 2024
Event: 7th International Women in Data Science Conference at Prince Sultan University, WiDS-PSU 2024 - Riyadh, Saudi Arabia
Duration: 3 Mar 2024 to 4 Mar 2024

Publication series

Name: Proceedings - 2024 7th International Women in Data Science Conference at Prince Sultan University, WiDS-PSU 2024

Conference

Conference: 7th International Women in Data Science Conference at Prince Sultan University, WiDS-PSU 2024
Country/Territory: Saudi Arabia
City: Riyadh
Period: 3/03/24 to 4/03/24

Keywords

  • cyber security
  • machine learning
  • natural language processing
  • random forest classifier
  • SMS phishing
