Concept drift and cross-device behavior: Challenges and implications for effective Android malware detection

Computers & Security - Volume 120 - Page 102757 - 2022
Alejandro Guerra-Manzanares (1), Marcin Luckner (2), Hayretdin Bahsi (1)
(1) Department of Software Science, Tallinn University of Technology, Estonia
(2) Faculty of Mathematics and Information Science, Warsaw University of Technology, Poland
