Exposing the Limitations of Machine Learning for Malware Detection Under Concept Drift

Ahmed Abusnaina; Afsah Anwar; Muhammad Saad; Abdulrahman Alabduljabbar; Rhongho Jang; Saeed Salem; David Mohaisen

doi:10.1007/978-981-96-0567-5_20

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The arms race between malware authors and defenders is characterized by (1) mutations to the malware samples and (2) model retraining to detect those mutations. Due to an exponential growth in the number of new malware samples reported per day (1.5 million [4]), detection frameworks’ reliance on model retraining naturally increased. Model retraining is the de facto approach to counter malware mutations. In this paper, we question the efficacy of machine learning in the context of malware detection by exposing various limitations in the retraining approaches. We show that model retraining only provides a marginal performance improvement for malicious sample detection while simultaneously degrading the benign sample detection performance. To address various issues in malware detection, we investigate the efficiency of several model retraining approaches. Our proposed approaches allow the malware detectors to retrain models in time to enable malware family emergence detection while concurrently monitoring the evolving patterns of malware family mutations.

Original language	English
Title of host publication	Web Information Systems Engineering – WISE 2024 - 25th International Conference, Proceedings
Editors	Mahmoud Barhamgi, Hua Wang, Xin Wang
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	273-289
Number of pages	17
ISBN (Print)	9789819605668
DOIs	https://doi.org/10.1007/978-981-96-0567-5_20
State	Published - 2025
Externally published	Yes
Event	25th International Conference on Web Information Systems Engineering, WISE 2024 - Doha, Qatar; Duration: 2 Dec 2024 → 5 Dec 2024

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	15437 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	25th International Conference on Web Information Systems Engineering, WISE 2024
Country/Territory	Qatar
City	Doha
Period	2 Dec 2024 → 5 Dec 2024

Keywords

Adversarial Machine Learning
Robust Malware Detection

Cite this

Abusnaina, A., Anwar, A., Saad, M., Alabduljabbar, A., Jang, R., Salem, S., & Mohaisen, D. (2025). Exposing the Limitations of Machine Learning for Malware Detection Under Concept Drift. In M. Barhamgi, H. Wang, & X. Wang (Eds.), Web Information Systems Engineering – WISE 2024 - 25th International Conference, Proceedings (pp. 273-289). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 15437 LNCS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-96-0567-5_20

Abusnaina, Ahmed ; Anwar, Afsah ; Saad, Muhammad et al. / Exposing the Limitations of Machine Learning for Malware Detection Under Concept Drift. Web Information Systems Engineering – WISE 2024 - 25th International Conference, Proceedings. editor / Mahmoud Barhamgi ; Hua Wang ; Xin Wang. Springer Science and Business Media Deutschland GmbH, 2025. pp. 273-289 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{598d82b8e76144e89a173a0b58c94ebd,

title = "Exposing the Limitations of Machine Learning for Malware Detection Under Concept Drift",

abstract = "The arms race between malware authors and defenders is characterized by (1) mutations to the malware samples and (2) model retraining to detect those mutations. Due to an exponential growth in the number of new malware samples reported per day (1.5 million [4]), detection frameworks{\textquoteright} reliance on model retraining naturally increased. Model retraining is the de facto approach to counter malware mutations. In this paper, we question the efficacy of machine learning in the context of malware detection by exposing various limitations in the retraining approaches. We show that model retraining only provides a marginal performance improvement for malicious sample detection while simultaneously degrading the benign sample detection performance. To address various issues in malware detection, we investigate the efficiency of several model retraining approaches. Our proposed approaches allow the malware detectors to retrain models in time to enable malware family emergence detection while concurrently monitoring the evolving patterns of malware family mutations.",

keywords = "Adversarial Machine Learning, Robust Malware Detection",

author = "Ahmed Abusnaina and Afsah Anwar and Muhammad Saad and Abdulrahman Alabduljabbar and Rhongho Jang and Saeed Salem and David Mohaisen",

note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.; 25th International Conference on Web Information Systems Engineering, WISE 2024 ; Conference date: 02-12-2024 Through 05-12-2024",

year = "2025",

doi = "10.1007/978-981-96-0567-5_20",

language = "English",

isbn = "9789819605668",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

volume = "15437",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "273--289",

editor = "Mahmoud Barhamgi and Hua Wang and Xin Wang",

booktitle = "Web Information Systems Engineering – WISE 2024 - 25th International Conference, Proceedings",

address = "Germany",

}

Abusnaina, A, Anwar, A, Saad, M, Alabduljabbar, A, Jang, R, Salem, S & Mohaisen, D 2025, Exposing the Limitations of Machine Learning for Malware Detection Under Concept Drift. in M Barhamgi, H Wang & X Wang (eds), Web Information Systems Engineering – WISE 2024 - 25th International Conference, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 15437 LNCS, Springer Science and Business Media Deutschland GmbH, pp. 273-289, 25th International Conference on Web Information Systems Engineering, WISE 2024, Doha, Qatar, 2/12/24. https://doi.org/10.1007/978-981-96-0567-5_20

Exposing the Limitations of Machine Learning for Malware Detection Under Concept Drift. / Abusnaina, Ahmed; Anwar, Afsah; Saad, Muhammad et al.
Web Information Systems Engineering – WISE 2024 - 25th International Conference, Proceedings. ed. / Mahmoud Barhamgi; Hua Wang; Xin Wang. Springer Science and Business Media Deutschland GmbH, 2025. p. 273-289 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 15437 LNCS).

TY - GEN

T1 - Exposing the Limitations of Machine Learning for Malware Detection Under Concept Drift

AU - Abusnaina, Ahmed

AU - Anwar, Afsah

AU - Saad, Muhammad

AU - Alabduljabbar, Abdulrahman

AU - Jang, Rhongho

AU - Salem, Saeed

AU - Mohaisen, David

N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.

PY - 2025

Y1 - 2025

AB - The arms race between malware authors and defenders is characterized by (1) mutations to the malware samples and (2) model retraining to detect those mutations. Due to an exponential growth in the number of new malware samples reported per day (1.5 million [4]), detection frameworks’ reliance on model retraining naturally increased. Model retraining is the de facto approach to counter malware mutations. In this paper, we question the efficacy of machine learning in the context of malware detection by exposing various limitations in the retraining approaches. We show that model retraining only provides a marginal performance improvement for malicious sample detection while simultaneously degrading the benign sample detection performance. To address various issues in malware detection, we investigate the efficiency of several model retraining approaches. Our proposed approaches allow the malware detectors to retrain models in time to enable malware family emergence detection while concurrently monitoring the evolving patterns of malware family mutations.

KW - Adversarial Machine Learning

KW - Robust Malware Detection

UR - http://www.scopus.com/inward/record.url?scp=85211937778&partnerID=8YFLogxK

U2 - 10.1007/978-981-96-0567-5_20

DO - 10.1007/978-981-96-0567-5_20

M3 - Conference contribution

AN - SCOPUS:85211937778

SN - 9789819605668

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 273

EP - 289

BT - Web Information Systems Engineering – WISE 2024 - 25th International Conference, Proceedings

A2 - Barhamgi, Mahmoud

A2 - Wang, Hua

A2 - Wang, Xin

PB - Springer Science and Business Media Deutschland GmbH

T2 - 25th International Conference on Web Information Systems Engineering, WISE 2024

Y2 - 2 December 2024 through 5 December 2024

ER -

Abusnaina A, Anwar A, Saad M, Alabduljabbar A, Jang R, Salem S et al. Exposing the Limitations of Machine Learning for Malware Detection Under Concept Drift. In Barhamgi M, Wang H, Wang X, editors, Web Information Systems Engineering – WISE 2024 - 25th International Conference, Proceedings. Springer Science and Business Media Deutschland GmbH. 2025. p. 273-289. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-981-96-0567-5_20