TY - JOUR
T1 - Gender identification using marginalised stacked denoising autoencoders on Twitter data
AU - Al-Onazi, Badriyya B.
AU - Nour, Mohamed K.
AU - Alshamrani, Hassan
AU - Al Duhayyim, Mesfer
AU - Mohsen, Heba
AU - Abdelmageed, Amgad Atta
AU - Mohammed, Gouse Pasha
AU - Zamani, Abu Sarwar
N1 - Publisher Copyright: © 2023, Tech Science Press. All rights reserved.
PY - 2023
Y1 - 2023
AB - Gender analysis of Twitter could reveal significant socio-cultural differences between female and male users. Efforts have been made to analyze and automatically infer gender for content in more commonly spoken languages, but only limited work has been undertaken for Arabic. Most research has focused on English, with far less effort devoted to non-English languages. Studies of demographic inference such as gender for Arabic-speaking social network users, especially on Twitter, remain relatively uncommon. Therefore, this study aims to design an optimal marginalized stacked denoising autoencoder for gender identification on Arabic Twitter (OMSDAE-GIAT) model. The presented OMSDAE-GIAT technique concentrates on identifying and classifying the gender of users in Twitter data. To attain this, the OMSDAE-GIAT model first performs data pre-processing and word embedding. Next, the MSDAE model is used to classify gender into two classes, namely male and female. In the final stage, the OMSDAE-GIAT technique uses an enhanced bat optimization algorithm (EBOA) for parameter tuning, which constitutes the novelty of the work. The performance of the OMSDAE-GIAT model is validated on an Arabic corpus dataset, and the results are measured under distinct metrics. The comparison study reported enhanced performance of the OMSDAE-GIAT model over other recent approaches.
KW - Arabic corpus
KW - Arabic Twitter
KW - Bat algorithm
KW - Gender identification
KW - Hybrid deep learning
KW - Social media
UR - http://www.scopus.com/inward/record.url?scp=85151137355&partnerID=8YFLogxK
U2 - 10.32604/iasc.2023.034623
DO - 10.32604/iasc.2023.034623
M3 - Article
AN - SCOPUS:85151137355
SN - 1079-8587
VL - 36
SP - 2529
EP - 2544
JO - Intelligent Automation and Soft Computing
JF - Intelligent Automation and Soft Computing
IS - 3
ER -