TY - JOUR
T1 - A Deep Reinforcement Learning Framework to Evade Black-Box Machine Learning Based IoT Malware Detectors Using GAN-Generated Influential Features
AU - Arif, Rahat Maqsood
AU - Aslam, Muhammad
AU - Al-Otaibi, Shaha
AU - Martinez-Enriquez, Ana Maria
AU - Saba, Tanzila
AU - Bahaj, Saeed Ali
AU - Rehman, Amjad
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2023
Y1 - 2023
N2 - In internet of things (IoT) networks, machine learning (ML) is widely used for malware and adversary detection. Recent research has shown that adversarial attacks put ML-based models at risk. This problem is exacerbated in an IoT environment because of the absence of adequate security measures. Consequently, it is crucial to evaluate the strength of such malware detectors using powerful adversarial samples. Existing adversarial sample generation strategies rely either on high-level image features or on an unfiltered feature set, making it challenging to determine which feature modifications are crucial for evading malware detection systems without compromising malware functionality. This motivates us to propose an evasion framework named IF-MalEvade, based on a Generative Adversarial Network (GAN) and Deep Reinforcement Learning (DRL), that effectively generates fully functional malware samples with several effective perturbations, such as header section manipulation and benign byte insertion. The DRL framework selects a few suitable action sequences to modify malicious samples, allowing our malware samples to bypass various black-box ML-based malware detectors and the detection engines of VirusTotal while maintaining the executability and malicious behavior of the original malware samples. The neural networks of the GAN take the unfiltered feature set of the malware dataset as input and, using a minimax objective function, yield a set of influential features that the DRL agent subsequently uses to make effective changes. Experimental results illustrate that, by utilizing the influential features in a sequence of transformations, the adversarial samples generated by our model outperform state-of-the-art evasion models with an impressive evasion rate. Additionally, the detection rate of well-known machine learning models was reduced by up to 97%. Furthermore, when the machine learning models were retrained using the adversarial samples, a 35% increase in detection accuracy was observed.
AB - In internet of things (IoT) networks, machine learning (ML) is widely used for malware and adversary detection. Recent research has shown that adversarial attacks put ML-based models at risk. This problem is exacerbated in an IoT environment because of the absence of adequate security measures. Consequently, it is crucial to evaluate the strength of such malware detectors using powerful adversarial samples. Existing adversarial sample generation strategies rely either on high-level image features or on an unfiltered feature set, making it challenging to determine which feature modifications are crucial for evading malware detection systems without compromising malware functionality. This motivates us to propose an evasion framework named IF-MalEvade, based on a Generative Adversarial Network (GAN) and Deep Reinforcement Learning (DRL), that effectively generates fully functional malware samples with several effective perturbations, such as header section manipulation and benign byte insertion. The DRL framework selects a few suitable action sequences to modify malicious samples, allowing our malware samples to bypass various black-box ML-based malware detectors and the detection engines of VirusTotal while maintaining the executability and malicious behavior of the original malware samples. The neural networks of the GAN take the unfiltered feature set of the malware dataset as input and, using a minimax objective function, yield a set of influential features that the DRL agent subsequently uses to make effective changes. Experimental results illustrate that, by utilizing the influential features in a sequence of transformations, the adversarial samples generated by our model outperform state-of-the-art evasion models with an impressive evasion rate. Additionally, the detection rate of well-known machine learning models was reduced by up to 97%. Furthermore, when the machine learning models were retrained using the adversarial samples, a 35% increase in detection accuracy was observed.
KW - Generative adversarial network
KW - adversarial attack
KW - deep reinforcement learning
KW - malware evasion
KW - portable executable (PE) malware
KW - technological development
UR - http://www.scopus.com/inward/record.url?scp=85178073677&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2023.3334645
DO - 10.1109/ACCESS.2023.3334645
M3 - Article
AN - SCOPUS:85178073677
SN - 2169-3536
VL - 11
SP - 133717
EP - 133729
JO - IEEE Access
JF - IEEE Access
ER -