Detection and classification of malicious software utilizing Max-Flows between system-call groups
Abstract
In this work, we present a graph-based method for detecting and classifying malicious software samples using the Max-Flows exhibited in their behavioral graphs. These graphs represent the interaction of a software sample with its host environment, and the Max-Flows computed on them depict the flow of information between System-call Groups. Starting from the System-call Dependency Graphs of the samples under consideration, we construct the corresponding Group Relation Graphs and then build the so-called Flow Maps, an alternative representation of the Group Relation Graphs that depicts the Max-Flows among their vertices. Additionally, we give a detailed presentation of the architecture and core components of the proposed approach, and discuss several technical aspects of its implementation and deployment. Finally, we conduct a series of five-fold cross-validation experiments to evaluate the potential of the approach in detecting and classifying malicious samples, and discuss the experimental results.
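To make the notion of a Flow Map more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a Group Relation Graph can be modelled as a weighted directed graph whose edge weights act as capacities, and it computes the Max-Flow between every ordered pair of groups with the standard Edmonds-Karp algorithm. The group names, the weights, and the helper names `max_flow` and `flow_map` are hypothetical and chosen only for illustration.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp Max-Flow on a dict-of-dicts capacity graph."""
    # Build residual capacities, adding reverse edges with capacity 0.
    residual = {}
    for u, nbrs in capacity.items():
        residual.setdefault(u, {})
        for v, c in nbrs.items():
            residual[u][v] = residual[u].get(v, 0) + c
            residual.setdefault(v, {}).setdefault(u, 0)
    if source == sink or source not in residual or sink not in residual:
        return 0

    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path is left

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck amount of flow along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


def flow_map(grg):
    """Max-Flow value for every ordered pair of distinct groups."""
    groups = sorted(set(grg) | {v for nbrs in grg.values() for v in nbrs})
    return {(s, t): max_flow(grg, s, t)
            for s in groups for t in groups if s != t}


# Hypothetical Group Relation Graph: the groups and weights are assumptions.
grg = {
    "File": {"Memory": 3, "Registry": 2},
    "Memory": {"Network": 2},
    "Registry": {"Network": 4},
}
fm = flow_map(grg)
print(fm[("File", "Network")])
```

In this toy graph, `fm` holds one Max-Flow value per ordered pair of groups, which can be flattened into a fixed-length vector for comparing samples. How the actual method weights the edges and compares Flow Maps is not specified in the abstract.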