TY - GEN
T1 - Exposing the Limitations of Machine Learning for Malware Detection Under Concept Drift
AU - Abusnaina, Ahmed
AU - Anwar, Afsah
AU - Saad, Muhammad
AU - Alabduljabbar, Abdulrahman
AU - Jang, Rhongho
AU - Salem, Saeed
AU - Mohaisen, David
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
PY - 2025
Y1 - 2025
AB - The arms race between malware authors and defenders is characterized by (1) mutations to the malware samples and (2) model retraining to detect those mutations. Due to an exponential growth in the number of new malware samples reported per day (1.5 million [4]), detection frameworks’ reliance on model retraining naturally increased. Model retraining is the de facto approach to counter malware mutations. In this paper, we question the efficacy of machine learning in the context of malware detection by exposing various limitations in the retraining approaches. We show that model retraining only provides a marginal performance improvement for malicious sample detection while simultaneously degrading the benign sample detection performance. To address various issues in malware detection, we investigate the efficiency of several model retraining approaches. Our proposed approaches allow the malware detectors to retrain models in time to enable malware family emergence detection while concurrently monitoring the evolving patterns of malware family mutations.
KW - Adversarial Machine Learning
KW - Robust Malware Detection
UR - http://www.scopus.com/inward/record.url?scp=85211937778&partnerID=8YFLogxK
U2 - 10.1007/978-981-96-0567-5_20
DO - 10.1007/978-981-96-0567-5_20
M3 - Conference contribution
AN - SCOPUS:85211937778
SN - 9789819605668
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 273
EP - 289
BT - Web Information Systems Engineering – WISE 2024 - 25th International Conference, Proceedings
A2 - Barhamgi, Mahmoud
A2 - Wang, Hua
A2 - Wang, Xin
PB - Springer Science and Business Media Deutschland GmbH
T2 - 25th International Conference on Web Information Systems Engineering, WISE 2024
Y2 - 2 December 2024 through 5 December 2024
ER -