CASCADE CLASSIFICATION ALGORITHM FOR DETECTING MALICIOUS SOFTWARE BY STATIC ANALYSIS
Abstract
This study presents the development and experimental validation of a two-level cascade architecture for the static classification of executable files in the Portable Executable (PE) format. The aim of the work is to reduce computational cost without compromising malware detection quality. The first level of the cascade is a decision tree model trained on the ten most informative features; it provides high recall (0.990) with an acceptable Type I error (false positive) rate. The second level is a random forest model on forty features that refines the classification, reaching a precision of 0.988 and a recall of 0.987, with an F1 score of 0.988. The classification threshold at the first level was set empirically to minimize Type II errors (missed malware), while at the second level the optimal threshold was determined by the Youden index, which balances sensitivity and specificity. Experiments on a representative sample show that when malicious objects make up less than 20% of the scanned traffic, the proposed cascade reduces the average analysis time per object by 5-12% compared with the 40-feature model while maintaining comparable classification quality. The time-efficiency limit of the cascade, 20.6%, was derived analytically and is confirmed by the empirical data. The practical significance of the work lies in the possibility of integrating the proposed algorithm into antivirus gateways and endpoint protection tools, where fast response and high detection recall are required during mass scanning of mostly legitimate code.
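The cascade logic summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the stand-in scoring callables, and the simple cost model (every object pays the first-level cost, and only objects flagged by the first level pay the second-level cost) are assumptions introduced here for illustration. Under that assumed model, the cascade is faster than the full model only while the malicious fraction stays below a break-even value, which is the kind of limit the abstract reports.

```python
from typing import Callable, Sequence


def youden_threshold(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Return the score threshold that maximizes Youden's J = TPR - FPR.

    labels use 1 for malicious and 0 for benign (an assumed convention).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best_j, best_t = float("-inf"), 0.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_j, best_t = j, t
    return best_t


def cascade_predict(
    x: object,
    stage1: Callable[[object], float],
    stage2: Callable[[object], float],
    t1: float,
    t2: float,
) -> bool:
    """Return True if x is classified as malicious by the two-level cascade."""
    if stage1(x) < t1:
        return False  # cheap early exit: the first level considers x benign
    return stage2(x) >= t2  # only flagged objects reach the costlier level


def expected_cascade_cost(
    p: float, cost1: float, cost2: float, recall1: float, fpr1: float
) -> float:
    """Mean per-object cost under the assumed model (hypothetical).

    p is the malicious fraction; the first level forwards true positives
    (rate recall1) and false positives (rate fpr1) to the second level.
    """
    forwarded = p * recall1 + (1.0 - p) * fpr1
    return cost1 + forwarded * cost2


def break_even_fraction(
    cost1: float, cost2: float, recall1: float, fpr1: float
) -> float:
    """Malicious fraction at which the cascade costs as much as level two alone.

    Obtained by solving cost1 + (p*recall1 + (1-p)*fpr1)*cost2 = cost2 for p.
    """
    return (1.0 - cost1 / cost2 - fpr1) / (recall1 - fpr1)
```

In this sketch `t1` would be chosen by hand to keep missed detections low, while `t2` could come from `youden_threshold` applied to validation scores of the second-level model. The costs and the first-level false positive rate are not given in the abstract, so any values passed to `break_even_fraction` are placeholders, and the result should not be expected to reproduce the reported 20.6% figure.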








