An efficient approach for textual data classification using deep learning

Abdullah Alqahtani; Habib Ullah Khan; Shtwai Alsubai; Mohemmed Sha; Ahmad Almadhor; Tayyab Iqbal; Sidra Abbas

doi:10.3389/fncom.2022.992296

Research output: Contribution to journal › Article › peer-review

18 Scopus citations

Abstract

Text categorization can be accomplished using a variety of classification algorithms. In machine learning, a classifier is built by learning the features of each category from a preset training set. Deep learning likewise offers substantial benefits for text classification, because its models achieve high accuracy with less manual engineering and processing. This paper employs machine learning and deep learning techniques to classify textual data. Textual data contains much irrelevant information and must be pre-processed: we clean the data, impute missing values, and eliminate repeated columns. We then apply three machine learning algorithms (logistic regression, random forest, and K-nearest neighbors (KNN)) and three deep learning algorithms (long short-term memory (LSTM), artificial neural network (ANN), and gated recurrent unit (GRU)) for classification. Results show that LSTM achieves 92% accuracy, outperforming all other models and the baseline studies.
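The pre-processing steps named in the abstract (cleaning, imputing missing values, removing repeated columns) can be illustrated with a minimal sketch. This is not the authors' code. The column-dictionary layout, the function names `impute_missing`, `drop_repeated_columns`, and `clean_text`, the sample data, and the choice of mean and most-frequent-value imputation are all assumptions made for illustration, since the abstract does not specify them.

```python
import re
from collections import Counter
from statistics import mean


def impute_missing(columns):
    """Fill None entries: column mean for numeric columns, most frequent value otherwise.

    The imputation strategy is an assumption; the abstract does not name one.
    """
    filled = {}
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        if not present:
            filled[name] = list(values)  # nothing to infer from; leave unchanged
            continue
        if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
            fill = mean(present)
        else:
            fill = Counter(present).most_common(1)[0][0]
        filled[name] = [fill if v is None else v for v in values]
    return filled


def drop_repeated_columns(columns):
    """Keep only the first of any group of columns holding identical values."""
    seen = set()
    kept = {}
    for name, values in columns.items():
        key = tuple(values)
        if key not in seen:
            seen.add(key)
            kept[name] = values
    return kept


def clean_text(text):
    """Lowercase, replace non-alphanumeric characters with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9\s]", " ", text.lower()).split())


# Hypothetical toy table, stored as column name -> list of values.
raw = {
    "text": ["Great PRODUCT!!!", "bad, very bad...", "Works  fine.", "Great value"],
    "text_copy": ["Great PRODUCT!!!", "bad, very bad...", "Works  fine.", "Great value"],
    "label": ["pos", "neg", None, "pos"],
    "length": [16, 16, None, 10],
}

table = drop_repeated_columns(impute_missing(raw))
table["text"] = [clean_text(t) for t in table["text"]]
```

The cleaned `table` would then be vectorized and passed to whichever classifier is under evaluation. That step is omitted here because the abstract gives no details of the feature representation.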

Original language	English
Article number	992296
Journal	Frontiers in Computational Neuroscience
Volume	16
DOIs	https://doi.org/10.3389/fncom.2022.992296
State	Published - 15 Sep 2022

Keywords

deep learning
machine learning
text categorization
text classification
text data

Cite this

@article{dc9698ff1561474a8c74d0a69897a1d1,

title = "An efficient approach for textual data classification using deep learning",

abstract = "Text categorization is an effective activity that can be accomplished using a variety of classification algorithms. In machine learning, the classifier is built by learning the features of categories from a set of preset training data. Similarly, deep learning offers enormous benefits for text classification since they execute highly accurately with lower-level engineering and processing. This paper employs machine and deep learning techniques to classify textual data. Textual data contains much useless information that must be pre-processed. We clean the data, impute missing values, and eliminate the repeated columns. Next, we employ machine learning algorithms: logistic regression, random forest, K-nearest neighbors (KNN), and deep learning algorithms: long short-term memory (LSTM), artificial neural network (ANN), and gated recurrent unit (GRU) for classification. Results reveal that LSTM achieves 92\% accuracy outperforming all other model and baseline studies.",

keywords = "deep learning, machine learning, text categorization, text classification, text data",

author = "Abdullah Alqahtani and {Ullah Khan}, Habib and Shtwai Alsubai and Mohemmed Sha and Ahmad Almadhor and Tayyab Iqbal and Sidra Abbas",

note = "Publisher Copyright: Copyright {\textcopyright} 2022 Alqahtani, Ullah Khan, Alsubai, Sha, Almadhor, Iqbal and Abbas.",

year = "2022",

month = sep,

day = "15",

doi = "10.3389/fncom.2022.992296",

language = "English",

volume = "16",

journal = "Frontiers in Computational Neuroscience",

issn = "1662-5188",

publisher = "Frontiers Media SA",

}

TY - JOUR

T1 - An efficient approach for textual data classification using deep learning

AU - Alqahtani, Abdullah

AU - Ullah Khan, Habib

AU - Alsubai, Shtwai

AU - Sha, Mohemmed

AU - Almadhor, Ahmad

AU - Iqbal, Tayyab

AU - Abbas, Sidra

PY - 2022/9/15

Y1 - 2022/9/15

N2 - Text categorization is an effective activity that can be accomplished using a variety of classification algorithms. In machine learning, the classifier is built by learning the features of categories from a set of preset training data. Similarly, deep learning offers enormous benefits for text classification since they execute highly accurately with lower-level engineering and processing. This paper employs machine and deep learning techniques to classify textual data. Textual data contains much useless information that must be pre-processed. We clean the data, impute missing values, and eliminate the repeated columns. Next, we employ machine learning algorithms: logistic regression, random forest, K-nearest neighbors (KNN), and deep learning algorithms: long short-term memory (LSTM), artificial neural network (ANN), and gated recurrent unit (GRU) for classification. Results reveal that LSTM achieves 92% accuracy outperforming all other model and baseline studies.

AB - Text categorization is an effective activity that can be accomplished using a variety of classification algorithms. In machine learning, the classifier is built by learning the features of categories from a set of preset training data. Similarly, deep learning offers enormous benefits for text classification since they execute highly accurately with lower-level engineering and processing. This paper employs machine and deep learning techniques to classify textual data. Textual data contains much useless information that must be pre-processed. We clean the data, impute missing values, and eliminate the repeated columns. Next, we employ machine learning algorithms: logistic regression, random forest, K-nearest neighbors (KNN), and deep learning algorithms: long short-term memory (LSTM), artificial neural network (ANN), and gated recurrent unit (GRU) for classification. Results reveal that LSTM achieves 92% accuracy outperforming all other model and baseline studies.

KW - deep learning

KW - machine learning

KW - text categorization

KW - text classification

KW - text data

UR - https://www.scopus.com/pages/publications/85139120374

U2 - 10.3389/fncom.2022.992296

DO - 10.3389/fncom.2022.992296

M3 - Article

AN - SCOPUS:85139120374

SN - 1662-5188

VL - 16

JO - Frontiers in Computational Neuroscience

JF - Frontiers in Computational Neuroscience

M1 - 992296

ER -
