TY - JOUR
T1 - Aliasing black box adversarial attack with joint self-attention distribution and confidence probability
AU - Liu, Jun
AU - Jin, Haoyu
AU - Xu, Guangxia
AU - Lin, Mingwei
AU - Wu, Tao
AU - Nour, Majid
AU - Alenezi, Fayadh
AU - Alhudhaif, Adi
AU - Polat, Kemal
N1 - Publisher Copyright:
© 2022 Elsevier Ltd
PY - 2023/3/15
Y1 - 2023/3/15
N2 - Deep neural networks (DNNs) are vulnerable to adversarial attacks, in which a small perturbation to a sample can cause misclassification. However, selecting the important words to perturb remains a major challenge for textual attack models. Therefore, this paper proposes an innovative score-based attack model to solve the important-word selection problem for textual attacks. To this end, the model generates semantic adversarial examples to mislead a text classification model. The model then integrates the self-attention mechanism and confidence probabilities to select the important words. Moreover, an alternative model, similar to that used in transfer attacks, is introduced to reflect the degree of correlation among words within a text. Finally, adversarial training experiments demonstrate the superiority of the proposed model.
AB - Deep neural networks (DNNs) are vulnerable to adversarial attacks, in which a small perturbation to a sample can cause misclassification. However, selecting the important words to perturb remains a major challenge for textual attack models. Therefore, this paper proposes an innovative score-based attack model to solve the important-word selection problem for textual attacks. To this end, the model generates semantic adversarial examples to mislead a text classification model. The model then integrates the self-attention mechanism and confidence probabilities to select the important words. Moreover, an alternative model, similar to that used in transfer attacks, is introduced to reflect the degree of correlation among words within a text. Finally, adversarial training experiments demonstrate the superiority of the proposed model.
KW - Adversarial attack
KW - Self-attention distribution
KW - Text classification
UR - http://www.scopus.com/inward/record.url?scp=85141261103&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2022.119110
DO - 10.1016/j.eswa.2022.119110
M3 - Article
AN - SCOPUS:85141261103
SN - 0957-4174
VL - 214
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 119110
ER -