AraEventCoref: An Arabic Event Coreference Dataset and LLM Benchmarks

Mohammed Aldawsari, Omer Dawood

Research output: Contribution to journal › Article › peer-review

Abstract

Event coreference resolution is a critical task in Natural Language Processing (NLP), enabling applications such as information extraction, text summarization, and question answering. However, resolving event coreference in Arabic presents unique challenges due to the language's rich morphology, complex syntax, and lack of annotated resources. This article introduces AraEventCoref, the first publicly available Arabic event coreference dataset, comprising 50 annotated news articles with 1,381 events and 159 coreference chains. Inter-annotator agreement on the coreference annotation reached a CoNLL score of 75.8%, the average of the MUC, B³, and CEAFe F1 scores, indicating reliable annotation. Event triggers were annotated with an inter-annotator agreement of 96% as measured by Cohen's kappa, further supporting dataset quality. To establish benchmarks, we fine-tuned a CamelBERT-msa model as a strong baseline and evaluated state-of-the-art Arabic large language models (LLMs) using both bilingual and Arabic-only prompts. Results demonstrate the effectiveness of fine-tuning for domain-specific adaptation and reveal the impact of bilingual prompting on LLM performance. By providing a high-quality dataset and benchmark results, this work lays a foundation for advancing Arabic event coreference research and supports future work on event relation extraction.
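The two agreement measures named in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions: Cohen's kappa as chance-corrected agreement between two annotators, and the CoNLL score as the unweighted mean of the MUC, B³, and CEAFe F1 scores. The function names and the toy labels below are hypothetical and are not taken from the AraEventCoref dataset or the authors' code.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each annotator's label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def conll_score(muc_f1, b_cubed_f1, ceaf_e_f1):
    """CoNLL score: unweighted mean of the three coreference F1 scores."""
    return (muc_f1 + b_cubed_f1 + ceaf_e_f1) / 3.0


# Hypothetical toy labels (not from the dataset): is each token an event trigger?
annotator_a = ["EVENT", "O", "EVENT", "O", "O", "EVENT", "O", "O"]
annotator_b = ["EVENT", "O", "EVENT", "O", "O", "O", "O", "O"]
kappa = cohens_kappa(annotator_a, annotator_b)

# Hypothetical F1 values, chosen only to show the averaging step.
example_conll = conll_score(70.0, 80.0, 75.0)
```

The MUC, B³, and CEAFe F1 values themselves come from a coreference scorer; this sketch only shows how they combine into a single CoNLL figure.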

Original language: English
Article number: 67
Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Volume: 24
Issue number: 7
State: Published - 10 Jul 2025

Keywords

  • Arabic event
  • Arabic event coreference
  • Arabic event relation extraction
