
Translating or Stealing? Probing the Limits of Cross-lingual Plagiarism Detection Systems in Literary Texts

Research output: Contribution to journal › Article › peer-review

Abstract

This study compares three plagiarism detection systems (Rabin-Karp, KNN, and Word2Vec) to measure their effectiveness in detecting cross-lingual plagiarism in Arabic literary texts translated from English. The dataset consisted of an Arabic translation of Daly Walker’s ‘I am the Grass’ (2012), produced by the authors and evaluated by three translators, each with more than ten years of experience. It was divided into 60 percent directly translated, 30 percent paraphrased, and 10 percent original content. Findings showed that KNN achieved the highest precision in detecting cross-lingual plagiarism (26.7%), while Word2Vec performed best with paraphrased content (16.7%). Rabin-Karp was the most reliable in detecting original content (80% precision); however, all three systems demonstrated low overall accuracy (23-26%). These findings highlight the limitations of current systems when applied to Arabic texts, primarily due to the language’s morpho-syntactic and lexical complexities. Because the study analyzes a single text, its scope is limited, and it recommends extending the analysis to multiple genres for broader generalizability. The study also recommends developing more sophisticated, hybrid plagiarism detection systems and building rich Arabic corpora to enhance their performance.

Original language: English
Pages (from-to): 393-418
Number of pages: 26
Journal: International Journal of Arabic-English Studies
Volume: 26
Issue number: 1
State: Published - 2 Jan 2026

Keywords

  • Arabic
  • cross-lingual plagiarism
  • detection accuracy
  • detection precision
  • translation
