Rumor detection in Arabic tweets using semi-supervised and unsupervised expectation–maximization

Samah M. Alzanin, Aqil M. Azmi

Research output: Contribution to journal › Article › peer-review

69 Scopus citations

Abstract

With the continued development of social networks, information spreads faster than ever. This speed has created a reliability problem, since any user can publish whatever they want. Automated systems that can detect fake content as quickly as it is disseminated are urgently required. Rumor detection in Arabic-language social networks has lagged behind work on other languages, particularly English. In this paper, we address the problem of detecting rumors in Arabic tweets. We used a set of features extracted from the user and the content, and analyzed these features to determine their significance. Semi-supervised expectation–maximization (E–M) was used to train the proposed system with topics of newsworthy tweets. A comparison with supervised Gaussian Naïve Bayes (NB) showed that our semi-supervised system, using a small base of labeled data, outperforms Gaussian NB, achieving an accuracy of 78.6%. The performance of the unsupervised E–M depends on the initial values, and we achieved an F1 score of 80% in one of our experiments.
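The general idea of semi-supervised E–M on top of a Gaussian NB model can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`semi_supervised_em`, `e_step`, `m_step`, `predict`), the diagonal-covariance assumption, the fixed iteration count, and the use of plain numeric feature vectors are all assumptions made for illustration. The model is first fitted on the small labeled set, then alternates between estimating soft labels for the unlabeled examples (E-step) and re-estimating the class parameters from all examples (M-step).

```python
import numpy as np

def log_gaussian(X, mean, var):
    # Log density of a diagonal Gaussian for every (row, class) pair.
    # X: (n, d); mean, var: (k, d); returns (n, k).
    diff = X[:, None, :] - mean[None, :, :]
    return -0.5 * np.sum(np.log(2 * np.pi * var)[None] + diff ** 2 / var[None], axis=2)

def m_step(X, resp, eps=1e-6):
    # Re-estimate priors, means and variances from soft memberships resp (n, k).
    weight = resp.sum(axis=0)
    prior = weight / weight.sum()
    mean = (resp.T @ X) / weight[:, None]
    var = (resp.T @ X ** 2) / weight[:, None] - mean ** 2
    var = np.maximum(var, 0.0) + eps  # guard against round-off and zero variance
    return prior, mean, var

def e_step(X, prior, mean, var):
    # Posterior class probabilities under the current parameters.
    log_post = np.log(prior)[None, :] + log_gaussian(X, mean, var)
    log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)

def semi_supervised_em(X_lab, y_lab, X_unlab, n_classes=2, n_iter=50):
    # Labeled rows keep their hard (one-hot) labels throughout.
    onehot = np.eye(n_classes)[y_lab]
    # Initialise from labeled data only, i.e. a supervised Gaussian NB fit.
    prior, mean, var = m_step(X_lab, onehot)
    X_all = np.vstack([X_lab, X_unlab])
    for _ in range(n_iter):
        resp_unlab = e_step(X_unlab, prior, mean, var)               # E-step
        prior, mean, var = m_step(X_all, np.vstack([onehot, resp_unlab]))  # M-step
    return prior, mean, var

def predict(X, prior, mean, var):
    # Assign each row to its most probable class.
    return e_step(X, prior, mean, var).argmax(axis=1)
```

In this sketch, an unsupervised variant would drop the labeled rows and start the loop from chosen initial parameters instead, which is one way to see why results can depend on the initial values.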

Original language: English
Article number: 104945
Journal: Knowledge-Based Systems
Volume: 185
State: Published - 1 Dec 2019
Externally published: Yes

Keywords

  • Arabic
  • Expectation–maximization
  • Rumor detection
  • Semi-supervised
  • Twitter
  • Unsupervised
