Content authentication and tampering detection of Arabic text: an approach based on zero-watermarking and natural language processing

Anwer Mustafa Hilal; Fahd N. Al-Wesabi; Manar Ahmed Hamza; Mohammed Medani; Khalid Mahmood; Mohammad Mahzari

doi:10.1007/s10044-021-01032-5

English Language and Literature

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

Due to the rapid increase in the exchange of text information over the internet, the security and reliability of digital content have become a major research issue. The main challenges faced by researchers are content authentication, integrity verification and tampering detection of digital content. In this paper, these issues are addressed with an emphasis on text information, which is natural language dependent. Hence, a novel intelligent zero-watermarking approach is proposed for content authentication and tampering detection of Arabic text. In the proposed approach, both the embedding and the extraction of the watermark are implemented logically, which causes no change to the digital text. This is achieved by using the fourth-level order and alphanumeric mechanism of a Markov model as a soft computing technique to analyse the Arabic text and obtain features of the given text, which are treated as the digital watermark. This digital watermark is later used to detect any tampering attack on the received Arabic text. An extensive set of experiments using four datasets of varying lengths demonstrates the performance of the proposed approach in terms of robustness, effectiveness and applicability under insertion, reorder and deletion attacks at multiple random locations. Compared with baseline approaches, the proposed approach shows improved watermark robustness and tampering detection accuracy.
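
The general idea in the abstract can be illustrated with a minimal Python sketch: a watermark is derived from the text itself using fourth-order character transitions, stored separately, and later compared against a received copy. This is not the authors' implementation. The function names (`normalize`, `build_watermark`, `match_rate`), the choice of character-level states, the use of `str.isalnum` as a stand-in for the "alphanumeric mechanism", and the matching score are all assumptions made for illustration, and the sketch uses a plain Markov transition table rather than a hidden Markov model.

```python
from collections import defaultdict

ORDER = 4  # assumed: each state is a run of four consecutive characters


def normalize(text):
    """Keep only alphanumeric characters (an assumed reading of the
    'alphanumeric mechanism'; the paper may define it differently)."""
    return "".join(ch for ch in text if ch.isalnum())


def build_watermark(text, order=ORDER):
    """Map each `order`-character state to the set of characters that follow it.

    The text itself is never modified; the returned table plays the role
    of the zero-watermark in this sketch.
    """
    chars = normalize(text)
    transitions = defaultdict(set)
    for i in range(len(chars) - order):
        state = chars[i:i + order]
        transitions[state].add(chars[i + order])
    return dict(transitions)


def match_rate(original_wm, received_text, order=ORDER):
    """Fraction of the original transitions still present in the received text
    (a hypothetical score, not the metric used in the paper)."""
    received_wm = build_watermark(received_text, order)
    total = sum(len(nxt) for nxt in original_wm.values())
    if total == 0:
        return 1.0  # nothing to compare, e.g. text shorter than order + 1
    kept = sum(
        len(nxt & received_wm.get(state, set()))
        for state, nxt in original_wm.items()
    )
    return kept / total


original = "التحقق من سلامة النص العربي"
watermark = build_watermark(original)  # kept by the sender, not embedded

# Simulate tampering by changing a single letter in the last word.
tampered = original.replace("العربي", "الغربي")

print(match_rate(watermark, original))        # 1.0 for an untouched copy
print(match_rate(watermark, tampered) < 1.0)  # True: some transitions are lost
```

A score below 1.0 signals that some recorded transitions no longer occur in the received text; where to set a decision threshold, and how to localize the change, are left open here.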

Original language	English
Pages (from-to)	47-62
Number of pages	16
Journal	Pattern Analysis and Applications
Volume	25
Issue number	1
DOIs	https://doi.org/10.1007/s10044-021-01032-5
State	Published - Feb 2022

Keywords

Alphanumeric mechanism
Content authentication
Hidden Markov model
Tampering detection
Text analysis
Zero-watermarking

Cite this

@article{1ed65c38186941e9820b71a9f06b949a,

title = "Content authentication and tampering detection of Arabic text: an approach based on zero-watermarking and natural language processing",

abstract = "Due to the rapid increase in exchange of text information via internet network, the security and the reliability of the digital content has become a major research issue. The main challenges faced by researchers are content authentication, integrity verification and tampering detection of digital contents. In this paper, these issues are addressed with great emphasis on text information, which is natural language dependent. Hence, a novel intelligent zero-watermarking approach is proposed for content authentication and tampering detection of Arabic text contents. In the proposed approach, both the embedding and extracting of the watermark are logically implemented, which causes no change on the digital text. This is achieved by using fourth-level-order and alphanumeric mechanism of Markov model as a soft computing technique for the analysis of Arabic text to obtain the features of the given text which is considered as the digital watermark. This digital watermark is later used for the detection of any tampering attack on the received Arabic text. An extensive set of experiments using four datasets of varying lengths proves the effectiveness of the proposed approach in terms of robustness, effectiveness and applicability under multiple random locations of insertion, reorder and deletion attacks. Compared with baseline approaches, the proposed approach has improved performance regarding watermark robustness and tampering detection accuracy.",

keywords = "Alphanumeric mechanism, Content authentication, Hidden Markov model, Tampering detection, Text analysis, Zero-watermarking",

author = "Hilal, Anwer Mustafa and Al-Wesabi, Fahd N. and Hamza, Manar Ahmed and Medani, Mohammed and Mahmood, Khalid and Mahzari, Mohammad",

note = "Publisher Copyright: {\textcopyright} 2021, The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature.",

year = "2022",

month = feb,

doi = "10.1007/s10044-021-01032-5",

language = "English",

volume = "25",

pages = "47--62",

journal = "Pattern Analysis and Applications",

issn = "1433-7541",

publisher = "Springer London",

number = "1",

}

TY - JOUR

T1 - Content authentication and tampering detection of Arabic text

T2 - an approach based on zero-watermarking and natural language processing

AU - Hilal, Anwer Mustafa

AU - Al-Wesabi, Fahd N.

AU - Hamza, Manar Ahmed

AU - Medani, Mohammed

AU - Mahmood, Khalid

AU - Mahzari, Mohammad

PY - 2022/2

Y1 - 2022/2

N2 - Due to the rapid increase in exchange of text information via internet network, the security and the reliability of the digital content has become a major research issue. The main challenges faced by researchers are content authentication, integrity verification and tampering detection of digital contents. In this paper, these issues are addressed with great emphasis on text information, which is natural language dependent. Hence, a novel intelligent zero-watermarking approach is proposed for content authentication and tampering detection of Arabic text contents. In the proposed approach, both the embedding and extracting of the watermark are logically implemented, which causes no change on the digital text. This is achieved by using fourth-level-order and alphanumeric mechanism of Markov model as a soft computing technique for the analysis of Arabic text to obtain the features of the given text which is considered as the digital watermark. This digital watermark is later used for the detection of any tampering attack on the received Arabic text. An extensive set of experiments using four datasets of varying lengths proves the effectiveness of the proposed approach in terms of robustness, effectiveness and applicability under multiple random locations of insertion, reorder and deletion attacks. Compared with baseline approaches, the proposed approach has improved performance regarding watermark robustness and tampering detection accuracy.

AB - Due to the rapid increase in exchange of text information via internet network, the security and the reliability of the digital content has become a major research issue. The main challenges faced by researchers are content authentication, integrity verification and tampering detection of digital contents. In this paper, these issues are addressed with great emphasis on text information, which is natural language dependent. Hence, a novel intelligent zero-watermarking approach is proposed for content authentication and tampering detection of Arabic text contents. In the proposed approach, both the embedding and extracting of the watermark are logically implemented, which causes no change on the digital text. This is achieved by using fourth-level-order and alphanumeric mechanism of Markov model as a soft computing technique for the analysis of Arabic text to obtain the features of the given text which is considered as the digital watermark. This digital watermark is later used for the detection of any tampering attack on the received Arabic text. An extensive set of experiments using four datasets of varying lengths proves the effectiveness of the proposed approach in terms of robustness, effectiveness and applicability under multiple random locations of insertion, reorder and deletion attacks. Compared with baseline approaches, the proposed approach has improved performance regarding watermark robustness and tampering detection accuracy.

KW - Alphanumeric mechanism

KW - Content authentication

KW - Hidden Markov model

KW - Tampering detection

KW - Text analysis

KW - Zero-watermarking

UR - https://www.scopus.com/pages/publications/85117851810

U2 - 10.1007/s10044-021-01032-5

DO - 10.1007/s10044-021-01032-5

M3 - Article

AN - SCOPUS:85117851810

SN - 1433-7541

VL - 25

SP - 47

EP - 62

JO - Pattern Analysis and Applications

JF - Pattern Analysis and Applications

IS - 1

ER -
