Abstract
This study compares three plagiarism detection systems (Rabin-Karp, KNN, and Word2Vec) to measure their effectiveness in detecting cross-lingual plagiarism in Arabic literary texts translated from English. The dataset was an Arabic translation of Daly Walker’s ‘I am the Grass’ (2012), produced by the authors and evaluated by three translators, each with more than ten years of experience. The dataset was divided into 60 percent directly translated, 30 percent paraphrased, and 10 percent original content. KNN achieved the highest precision in detecting cross-lingual plagiarism (26.7%), while Word2Vec performed best on paraphrased content (16.7%). Rabin-Karp was the most reliable in detecting original content (80% precision); however, all three systems showed low overall accuracy (23-26%). These findings highlight the limitations of current systems when applied to Arabic texts, primarily due to the language’s morpho-syntactic and lexical complexities. Because the study analyzes a single text, its scope is limited, and it recommends expanding the analysis to multiple genres for broader generalizability. It further recommends developing more sophisticated hybrid plagiarism detection systems and building rich Arabic corpora to improve their performance.
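The abstract reports per-category precision alongside overall accuracy, and the two measures can diverge sharply. The following is a minimal sketch of how they differ, not the study's evaluation code: the function names are hypothetical, and the labels and predictions are invented toy values that only mirror the 60/30/10 split, not the study's data.

```python
from collections import Counter

def per_class_precision(y_true, y_pred):
    """Precision per label: correct predictions of a label / all predictions of that label."""
    predicted = Counter(y_pred)
    correct = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / predicted[label] for label in predicted}

def accuracy(y_true, y_pred):
    """Share of all segments whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy example (assumed, not from the study): 10 segments split 60/30/10.
y_true = ["translated"] * 6 + ["paraphrased"] * 3 + ["original"] * 1
y_pred = (["paraphrased"] * 4 + ["translated"] * 2   # guesses for the 6 translated segments
          + ["translated"] * 2 + ["paraphrased"]     # guesses for the 3 paraphrased segments
          + ["original"])                            # guess for the 1 original segment

precision = per_class_precision(y_true, y_pred)
overall = accuracy(y_true, y_pred)
print(precision, overall)
```

In this toy case the detector labels a segment "original" only once and is right, so precision for that class is perfect even though most segments overall are misclassified. A high precision on a small class therefore says little about overall accuracy.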
| Original language | English |
|---|---|
| Pages (from-to) | 393-418 |
| Number of pages | 26 |
| Journal | International Journal of Arabic-English Studies |
| Volume | 26 |
| Issue number | 1 |
| State | Published - 2 Jan 2026 |
Keywords
- Arabic
- cross-lingual plagiarism
- detection accuracy
- detection precision
- translation
Title
Translating or Stealing? Probing the Limits of Cross-lingual Plagiarism Detection Systems in Literary Texts