An Effective Approach for Rumor Detection of Arabic Tweets Using eXtreme Gradient Boosting Method

Abdu Gumaei; Mabrook S. Al-Rakhami; Mohammad Mehedi Hassan; Victor Hugo C. De Albuquerque; David Camacho

doi:10.1145/3461697

Research output: Contribution to journal › Article › peer-review

20 Scopus citations

Abstract

Twitter is currently one of the most popular microblogging platforms allowing people to post short messages, news, thoughts, and so on. The Twitter user community is growing very fast. It has an average of 328 million active accounts today, making it one of the most common media for getting information during any influential or important event. Because it is freely used by the public, some credibility checking is required, especially when it comes to events of high importance. Automatic rumor detection in Arabic tweets is a challenging task due to the changes in the structural and morphological nature of the Arabic language, which makes the detection of rumors more difficult than in other languages. In this article, we proposed an effective approach for rumor detection of Arabic tweets using an eXtreme gradient boosting (XGBoost) classifier. We conducted a set of experiments on a public dataset that contained a large number of rumor and non-rumor tweets. The model uses a comprehensive set of features, including content-based, user-based, and topic-based features, allowing one to look at credibility from different angles. The experimental results demonstrated that the proposed XGBoost-based approach achieves 97.18% accuracy on 60% of the dataset as a training set, which is the highest accuracy rate compared with the other methods used in recent related work.

Original language	English
Article number	3461697
Journal	ACM Transactions on Asian and Low-Resource Language Information Processing
Volume	21
Issue number	1
DOIs	https://doi.org/10.1145/3461697
State	Published - Jan 2022
Externally published	Yes

Keywords

Arabic
Rumor detection
Twitter
XGBoost method
machine learning

Cite this

@article{b8165732b75b470086cb9133678bd347,

title = "An Effective Approach for Rumor Detection of Arabic Tweets Using eXtreme Gradient Boosting Method",

abstract = "Twitter is currently one of the most popular microblogging platforms allowing people to post short messages, news, thoughts, and so on. The Twitter user community is growing very fast. It has an average of 328 million active accounts today, making it one of the most common media for getting information during any influential or important event. Because it is freely used by the public, some credibility checking is required, especially when it comes to events of high importance. Automatic rumor detection in Arabic tweets is a challenging task due to the changes in the structural and morphological nature of the Arabic language, which makes the detection of rumors more difficult than in other languages. In this article, we proposed an effective approach for rumor detection of Arabic tweets using an eXtreme gradient boosting (XGBoost) classifier. We conducted a set of experiments on a public dataset that contained a large number of rumor and non-rumor tweets. The model uses a comprehensive set of features, including content-based, user-based, and topic-based features, allowing one to look at credibility from different angles. The experimental results demonstrated that the proposed XGBoost-based approach achieves 97.18\% accuracy on 60\% of the dataset as a training set, which is the highest accuracy rate compared with the other methods used in recent related work.",

keywords = "Arabic, Rumor detection, Twitter, XGBoost method, machine learning",

author = "Gumaei, Abdu and Al-Rakhami, Mabrook S. and Hassan, Mohammad Mehedi and {De Albuquerque}, Victor Hugo C. and Camacho, David",

note = "Publisher Copyright: {\textcopyright} 2022 Association for Computing Machinery.",

year = "2022",

month = jan,

doi = "10.1145/3461697",

language = "English",

volume = "21",

journal = "ACM Transactions on Asian and Low-Resource Language Information Processing",

issn = "2375-4699",

publisher = "Association for Computing Machinery (ACM)",

number = "1",

}

TY - JOUR

T1 - An Effective Approach for Rumor Detection of Arabic Tweets Using eXtreme Gradient Boosting Method

AU - Gumaei, Abdu

AU - Al-Rakhami, Mabrook S.

AU - Hassan, Mohammad Mehedi

AU - De Albuquerque, Victor Hugo C.

AU - Camacho, David

PY - 2022/1

Y1 - 2022/1

N2 - Twitter is currently one of the most popular microblogging platforms allowing people to post short messages, news, thoughts, and so on. The Twitter user community is growing very fast. It has an average of 328 million active accounts today, making it one of the most common media for getting information during any influential or important event. Because it is freely used by the public, some credibility checking is required, especially when it comes to events of high importance. Automatic rumor detection in Arabic tweets is a challenging task due to the changes in the structural and morphological nature of the Arabic language, which makes the detection of rumors more difficult than in other languages. In this article, we proposed an effective approach for rumor detection of Arabic tweets using an eXtreme gradient boosting (XGBoost) classifier. We conducted a set of experiments on a public dataset that contained a large number of rumor and non-rumor tweets. The model uses a comprehensive set of features, including content-based, user-based, and topic-based features, allowing one to look at credibility from different angles. The experimental results demonstrated that the proposed XGBoost-based approach achieves 97.18% accuracy on 60% of the dataset as a training set, which is the highest accuracy rate compared with the other methods used in recent related work.

AB - Twitter is currently one of the most popular microblogging platforms allowing people to post short messages, news, thoughts, and so on. The Twitter user community is growing very fast. It has an average of 328 million active accounts today, making it one of the most common media for getting information during any influential or important event. Because it is freely used by the public, some credibility checking is required, especially when it comes to events of high importance. Automatic rumor detection in Arabic tweets is a challenging task due to the changes in the structural and morphological nature of the Arabic language, which makes the detection of rumors more difficult than in other languages. In this article, we proposed an effective approach for rumor detection of Arabic tweets using an eXtreme gradient boosting (XGBoost) classifier. We conducted a set of experiments on a public dataset that contained a large number of rumor and non-rumor tweets. The model uses a comprehensive set of features, including content-based, user-based, and topic-based features, allowing one to look at credibility from different angles. The experimental results demonstrated that the proposed XGBoost-based approach achieves 97.18% accuracy on 60% of the dataset as a training set, which is the highest accuracy rate compared with the other methods used in recent related work.

KW - Arabic

KW - Rumor detection

KW - Twitter

KW - XGBoost method

KW - machine learning

UR - https://www.scopus.com/pages/publications/85124256038

U2 - 10.1145/3461697

DO - 10.1145/3461697

M3 - Article

AN - SCOPUS:85124256038

SN - 2375-4699

VL - 21

JO - ACM Transactions on Asian and Low-Resource Language Information Processing

JF - ACM Transactions on Asian and Low-Resource Language Information Processing

IS - 1

M1 - 3461697

ER -
