A Survey of Natural Language Processing for Classification of Saudi Arabic Dialect: Advancements, Opportunities, and Challenges

Sulaiman Aftan, Yu Zhuang, Ahmad O. Aseeri, Habib Shah

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Several areas of artificial intelligence, including machine learning, deep neural networks, and large language models (LLMs), have greatly influenced human communication through natural language processing (NLP) technologies such as text generation, translation, text analysis, and sentiment analysis across many languages, including English and Arabic. Arabic is particularly influential among these languages, with approximately 300 million speakers worldwide, which has motivated the field of Arabic Natural Language Processing (ANLP). ANLP has emerged as a successful NLP area, particularly in dialect classification, generation, and translation, with the Saudi Dialect (SD) being a notable focus due to its value in the Middle East. Researchers have applied different types of NLP architectures across domains ranging from everyday use to social and business platforms to address the challenges and applications associated with SD. This survey reviews and summarizes five years of research in this field, from 2020 to 2024, showcasing the successes achieved and identifying research opportunities to improve the understanding and use of NLP in diverse SD scenarios. The survey also sheds light on the challenges encountered in acquiring SD datasets for efficient analysis with different NLP methodologies.

Original language: English
Title of host publication: Emerging Technologies in Computing - 7th EAI International Conference, iCETiC 2024, Proceedings
Editors: Mahdi H. Miraz, Andrew Ware, Garfield Southall, Maaruf Ali
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 105-124
Number of pages: 20
ISBN (Print): 9783031926242
State: Published - 2026
Event: 7th EAI International Conference on Emerging Technologies in Computing, iCETiC 2024 - Essex, United Kingdom
Duration: 15 Aug 2024 to 16 Aug 2024

Publication series

Name: Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST
Volume: 623 LNICST
ISSN (Print): 1867-8211
ISSN (Electronic): 1867-822X

Conference

Conference: 7th EAI International Conference on Emerging Technologies in Computing, iCETiC 2024
Country/Territory: United Kingdom
City: Essex
Period: 15/08/24 to 16/08/24

Keywords

  • DL
  • NLP Survey
  • Saudi and Arabic Dialect Classification
