Feature Impact on Sentiment Extraction of TEnglish Code-Mixed Movie Tweets

  • S. Padmaja
  • M. Nikitha
  • Sasidhar Bandu
  • S. Sameen Fatima

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

Sentiment extraction is a natural language processing task that deals with detecting and classifying sentiments in monolingual and bilingual texts. Automating sentiment extraction from social media text is a pertinent research area because such text contains an enormous amount of noisy multilingual content. This work focuses on extracting sentiments from code-mixed Telugu–English (TEnglish) bilingual movie tweets in Roman script, collected using the Twitter API. First, every tweet in the dataset was annotated with the source language of each word and with the sentiment expressed in the tweet. Sentiment extraction on the annotated data was then automated using a machine learning-based approach. Classification used features such as character N-grams, emoticons, repetitive characters, intensifiers, and negation words, with a support vector machine classifier with a radial basis function kernel, chosen because it performs efficiently with high-dimensional feature vectors. The study aimed to identify which type of feature has the greatest impact on capturing sentiment. The results show that character N-grams, emoticons, and negation words are the features that affect accuracy the most.
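
The feature types named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the word lists, the emoticon pattern, the trigram size, and the `gamma` value are all assumptions made for illustration, and the romanized Telugu entries are hypothetical examples rather than the lexicon used in the paper. The `rbf_kernel` function only shows the similarity measure that a radial basis function support vector machine relies on; it does not train a classifier.

```python
import math
import re
from collections import Counter

# Hypothetical, illustrative lexicons (not taken from the paper).
NEGATION_WORDS = {"not", "no", "never", "kadu", "ledu"}
INTENSIFIERS = {"very", "super", "chala"}

# Assumed pattern covering a few common Western-style emoticons such as :) ;-( =D
EMOTICON_RE = re.compile(r"[:;=][-']?[)(DPp/\\|]")
# A word character repeated three or more times in a row, e.g. "goooood"
REPEAT_RE = re.compile(r"(\w)\1{2,}")


def char_ngrams(text, n=3):
    """Count overlapping character n-grams of the lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def extract_features(tweet, n=3):
    """Build a sparse feature dict from one tweet."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    features = {f"ng:{gram}": count for gram, count in char_ngrams(tweet, n).items()}
    features["emoticons"] = len(EMOTICON_RE.findall(tweet))
    features["repeats"] = len(REPEAT_RE.findall(tweet))
    features["intensifiers"] = sum(t in INTENSIFIERS for t in tokens)
    features["negations"] = sum(t in NEGATION_WORDS for t in tokens)
    return features


def rbf_kernel(a, b, gamma=0.1):
    """RBF similarity exp(-gamma * ||a - b||^2) between two sparse feature dicts."""
    keys = set(a) | set(b)
    squared_distance = sum((a.get(k, 0) - b.get(k, 0)) ** 2 for k in keys)
    return math.exp(-gamma * squared_distance)


if __name__ == "__main__":
    f1 = extract_features("Movie chala baagundiii :) not boring")
    f2 = extract_features("Movie not good :(")
    print(f1["emoticons"], f1["repeats"], f1["intensifiers"], f1["negations"])
    print(rbf_kernel(f1, f2))
```

Keeping each feature type under its own key makes it straightforward to drop one group at a time and compare accuracy, which is the kind of feature-impact comparison the abstract describes.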

Original language: English
Title of host publication: Smart Computing Techniques and Applications - Proceedings of the 4th International Conference on Smart Computing and Informatics
Editors: Suresh Chandra Satapathy, Vikrant Bhateja, Margarita N. Favorskaya, T. Adilakshmi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 487-493
Number of pages: 7
ISBN (Print): 9789811615016
State: Published - 2021
Externally published: Yes
Event: 4th International Conference on Smart Computing and Informatics, SCI 2020 - Hyderabad, India
Duration: 9 Oct 2020 → 10 Oct 2020

Publication series

Name: Smart Innovation, Systems and Technologies
Volume: 224
ISSN (Print): 2190-3018
ISSN (Electronic): 2190-3026

Conference

Conference: 4th International Conference on Smart Computing and Informatics, SCI 2020
Country/Territory: India
City: Hyderabad
Period: 9/10/20 → 10/10/20

Keywords

  • Code-mixed tweets
  • Natural language processing
  • Sentiment extraction
