Abstract
With the continued growth of social networks, information spreads faster than ever. This speed has created a reliability problem, because any user can publish whatever they want. Automated systems that can detect fake content as quickly as it is disseminated are urgently required. Rumor detection in Arabic-language social networks has lagged behind work on other languages, particularly English. In this paper, we address the problem of detecting rumors in Arabic tweets. We used a set of features extracted from the user and the content, and analyzed these features to determine their significance. Semi-supervised expectation–maximization (E–M) was used to train the proposed system with topics of newsworthy tweets. A comparison with supervised Gaussian Naïve Bayes (NB) showed that our semi-supervised system, using a small base of labeled data, outperforms Gaussian NB, achieving an accuracy of 78.6%. The performance of the unsupervised E–M depends on the initial values, and we achieved an F1 score of 80% in one of our experiments.
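As a rough illustration of the semi-supervised E–M idea mentioned in the abstract, the following Python sketch trains a Gaussian Naïve Bayes model from a small labeled set plus unlabeled data. It is not the authors' implementation: the function names (`fit_weighted_gnb`, `posterior`, `semi_supervised_em`, `predict`), the variance floor, and the fixed iteration count are assumptions made for this example, and the user and content features are assumed to be already encoded as numeric arrays. It also assumes every class has at least one labeled example.

```python
import numpy as np


def fit_weighted_gnb(X, R, var_floor=1e-6):
    """M-step: Gaussian NB parameters weighted by responsibilities R of shape (n, k)."""
    weight = R.sum(axis=0)                       # effective sample count per class
    priors = weight / weight.sum()
    means = (R.T @ X) / weight[:, None]
    # Weighted per-feature variance, floored for numerical stability (assumed value).
    variances = (R.T @ (X ** 2)) / weight[:, None] - means ** 2
    variances = np.maximum(variances, var_floor)
    return priors, means, variances


def posterior(X, priors, means, variances):
    """E-step: class posteriors under independent per-feature Gaussians."""
    diff = X[:, None, :] - means[None, :, :]     # shape (n, k, d)
    log_lik = -0.5 * (
        np.log(2 * np.pi * variances)[None] + diff ** 2 / variances[None]
    ).sum(axis=2)
    log_post = np.log(priors)[None] + log_lik
    log_post -= log_post.max(axis=1, keepdims=True)  # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum(axis=1, keepdims=True)


def semi_supervised_em(X_lab, y_lab, X_unl, n_classes=2, n_iter=50):
    """Initialise from labeled rows, then alternate E- and M-steps on all rows."""
    R_lab = np.eye(n_classes)[y_lab]             # labeled rows keep fixed one-hot weights
    params = fit_weighted_gnb(X_lab, R_lab)
    X_all = np.vstack([X_lab, X_unl])
    for _ in range(n_iter):
        R_unl = posterior(X_unl, *params)        # soft labels for unlabeled rows
        params = fit_weighted_gnb(X_all, np.vstack([R_lab, R_unl]))
    return params


def predict(X, params):
    """Return the most probable class index for each row of X."""
    return posterior(X, *params).argmax(axis=1)
```

Under these assumptions, a call such as `params = semi_supervised_em(X_lab, y_lab, X_unl)` followed by `predict(X_new, params)` would label new feature vectors as rumor or non-rumor. Dropping the labeled rows and starting from arbitrary initial parameters gives the purely unsupervised variant, whose result depends on that initialization, as the abstract notes.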
Original language | English |
---|---|
Article number | 104945 |
Journal | Knowledge-Based Systems |
Volume | 185 |
State | Published - 1 Dec 2019 |
Externally published | Yes |
Keywords
- Arabic
- Expectation–maximization
- Rumor detection
- Semi-supervised
- Unsupervised