Aara' - A system for mining the polarity of Saudi public opinion through e-newspaper comments

Aqil M. Azmi, Samah M. Alzanin

Research output: Contribution to journal › Article › peer-review

31 Scopus citations

Abstract

Aara' is a system for mining opinion polarity from the pool of comments that readers post anonymously on the online editions of Saudi newspapers. We use a naïve Bayes classifier with a revised n-gram approach to extract the polarity of public opinion, which is expressed in Arabic, classifying it into four categories. For training, we manually marked each comment as belonging to one of the categories. All words in the training-set documents were removed except those with explicit connotations. After training, the words designated as vocabulary were classified into one of the categories. Our system performs polarity classification over informal colloquial Arabic that is unstructured and contains a reasonable proportion of spelling errors. In testing, our system achieved a macro-averaged precision of 86.5% and a macro-averaged F-score of 84.5%, with an accuracy of 82%.
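The pipeline summarized in the abstract (filter comments down to a vocabulary of connotation-bearing terms, classify with naïve Bayes over n-grams, report macro-averaged scores) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify what the "revised" n-gram approach changes, so the sketch assumes plain unigrams and bigrams with add-one smoothing. The class name `NaiveBayesPolarity`, the helper names, the two placeholder labels, and the English toy comments are all hypothetical, chosen only to keep the example short.

```python
import math
from collections import Counter, defaultdict


def ngrams(tokens, n_max=2):
    """Return all contiguous n-grams of length 1..n_max as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


class NaiveBayesPolarity:
    """Multinomial naive Bayes restricted to a fixed vocabulary (illustrative only)."""

    def __init__(self, vocabulary, n_max=2):
        # Assumption: the vocabulary stands in for the connotation-bearing
        # terms kept after filtering the training comments.
        self.vocabulary = set(vocabulary)
        self.n_max = n_max

    def _features(self, text):
        # Whitespace tokenization is an assumption; drop n-grams outside the vocabulary.
        return [g for g in ngrams(text.split(), self.n_max) if g in self.vocabulary]

    def fit(self, comments, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        for text, label in zip(comments, labels):
            self.feature_counts[label].update(self._features(text))
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        features = self._features(text)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.feature_counts[label]
            total = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior
            for f in features:
                # Add-one (Laplace) smoothing, an assumption of this sketch.
                score += math.log((counts[f] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def macro_precision(y_true, y_pred):
    """Unweighted mean of per-class precision over all classes seen."""
    classes = set(y_true) | set(y_pred)
    per_class = []
    for c in classes:
        predicted = sum(1 for p in y_pred if p == c)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == p == c)
        per_class.append(correct / predicted if predicted else 0.0)
    return sum(per_class) / len(per_class)


# Toy data with placeholder labels; the paper uses four categories.
train = [
    ("great decision", "positive"),
    ("good plan", "positive"),
    ("not good at all", "negative"),
    ("bad idea", "negative"),
]
vocabulary = {"great", "good", "not good", "bad"}
model = NaiveBayesPolarity(vocabulary).fit(*zip(*train))
prediction = model.predict("this is not good")
```

Including the bigram "not good" in the vocabulary lets the toy model treat a negated term differently from the bare term, which is one common motivation for going beyond unigrams in polarity classification. Macro averaging gives each category equal weight regardless of how many comments it contains.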

Original language: English
Pages (from-to): 398-410
Number of pages: 13
Journal: Journal of Information Science
Volume: 40
Issue number: 3
State: Published - Jun 2014
Externally published: Yes

Keywords

  • Arabic NLP
  • Colloquial Arabic
  • Naive Bayes
  • Public sentiment
  • Revised n-gram
