Empirical assessment of machine learning-based malware detectors for Android

Kevin Allix¹, Tégawendé F. Bissyandé¹, Quentin Jérôme¹, Jacques Klein¹, Radu State¹, Yves Le Traon¹
¹Interdisciplinary Center for Security, Reliability and Trust, University of Luxembourg, Luxembourg, Luxembourg

Abstract

Keywords


References

Allix K, Bissyandé TF, Jérôme Q, Klein J, State R, Le Traon Y (2014a) Large-scale machine learning-based malware detection: confronting the “10-fold cross validation” scheme with reality. In: Proceedings of the 4th ACM conference on data and application security and privacy. ACM, New York, CODASPY ’14, pp 163–166. doi: 10.1145/2557547.2557587

Allix K, Jérôme Q, Bissyandé TF, Klein J, State R, Le Traon Y (2014b) A forensic analysis of android malware: how is malware written and how it could be detected? In: Computer software and applications conference (COMPSAC)

Amos B, Turner H, White J (2013) Applying machine learning classifiers to dynamic android malware detection at scale. In: 2013 9th international wireless communications and mobile computing conference (IWCMC), pp 1666–1671. doi: 10.1109/IWCMC.2013.6583806

AndroGuard (2013) Apktool for reverse engineering android applications. https://code.google.com/p/androguard/ . Accessed 09 Sep 2013

AppBrain (2013a) Comparison of free and paid android apps. http://www.appbrain.com/stats/free-and-paid-android-applications . Accessed 09 Sep 2013

AppBrain (2013b) Number of available android applications. http://www.appbrain.com/stats/number-of-android-apps . Accessed 09 Sep 2013

Breiman L (2001) Random forests. Mach Learn 45(1):5–32

Canfora G, Mercaldo F, Visaggio CA (2013) A classifier of malicious android applications. In: 2013 eighth international conference on availability, reliability and security (ARES)

Cesare S, Xiang Y (2010) Classification of malware using structured control flow. In: Proceedings of the eighth Australasian symposium on parallel and distributed computing, vol 107. Australian Computer Society, Inc., Darlinghurst, Australia, AusPDC ’10, pp 61–70

Cohen WW (1995) Fast effective rule induction. In: Machine learning-international workshop then conference. Morgan Kaufmann Publishers, Inc., pp 115–123

Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297. doi: 10.1007/BF00994018

Demme J, Maycock M, Schmitz J, Tang A, Waksman A, Sethumadhavan S, Stolfo S (2013) On the feasibility of online malware detection with performance counters. In: Proceedings of the 40th annual international symposium on computer architecture. ACM, New York, ISCA ’13, pp 559–570. doi: 10.1145/2485922.2485970

Enck W, Octeau D, McDaniel P, Chaudhuri S (2011) A study of android application security. In: Proceedings of the 20th USENIX conference on security. USENIX Association, Berkeley, SEC’11, pp 21–21. http://dl.acm.org/citation.cfm?id=2028067.2028088

Felt AP, Finifter M, Chin E, Hanna S, Wagner D (2011) A survey of mobile malware in the wild. In: Proceedings of the 1st ACM workshop on security and privacy in smartphones and mobile devices. ACM, New York, SPSM ’11, pp 3–14. doi: 10.1145/2046614.2046618

Google (2012) Android and security (bouncer announcement). http://googlemobile.blogspot.fr/2012/02/android-and-security.html . Accessed 14 June 2014

Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH (2009) The weka data mining software: an update. SIGKDD Explor Newsl 11(1):10–18. doi: 10.1145/1656274.1656278

He H, Garcia E (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263–1284. doi: 10.1109/TKDE.2008.239

Henchiri O, Japkowicz N (2006) A feature selection and evaluation scheme for computer virus detection. In: Proceedings of the sixth international conference on data mining. IEEE Computer Society, Washington, DC, ICDM ’06, pp 891–895. doi: 10.1109/ICDM.2006.4

Jacob A, Gokhale M (2007) Language classification using n-grams accelerated by fpga-based bloom filters. In: Proceedings of the 1st international workshop on high-performance reconfigurable computing technology and applications: held in conjunction with SC07. Reno, Nevada, HPRCTA ’07, pp 31–37

Kephart JO (1994) A biologically inspired immune system for computers. In: Artificial life IV: proceedings of the fourth international workshop on the synthesis and simulation of living systems. MIT Press, pp 130–139

Kolter JZ, Maloof MA (2006) Learning to detect and classify malicious executables in the wild. J Mach Learn Res 7:2721–2744. http://dl.acm.org/citation.cfm?id=1248547.1248646

McLachlan G, Do KA, Ambroise C (2005) Analyzing microarray gene expression data, vol 422. Wiley

Perdisci R, Lanzi A, Lee W (2008a) Classification of packed executables for accurate computer virus detection. Pattern Recogn Lett 29(14):1941–1946. http://www.sciencedirect.com/science/article/pii/S0167865508002110

Perdisci R, Lanzi A, Lee W (2008b) Mcboost: boosting scalability in malware collection and analysis using statistical classification of executables. In: Computer security applications conference, 2008. ACSAC 2008. Annual, pp 301–310. doi: 10.1109/ACSAC.2008.22

Pieterse H, Olivier M (2012) Android botnets on the rise: trends and characteristics. In: Information security for South Africa (ISSA), 2012, pp 1–5. doi: 10.1109/ISSA.2012.6320432

Pouik G (2012) Similarities for fun & profit. Phrack 14(68). http://www.phrack.org/issues.html?id=15&issue=68

Quinlan JR (1993) C4.5: programs for machine learning, vol 1. Morgan Kaufmann

Rossow C, Dietrich C, Grier C, Kreibich C, Paxson V, Pohlmann N, Bos H, van Steen M (2012) Prudent practices for designing malware experiments: status quo and outlook. In: 2012 IEEE symposium on security and privacy (SP), pp 65–79. doi: 10.1109/SP.2012.14

Sahs J, Khan L (2012) A machine learning approach to android malware detection. In: 2012 European intelligence and security informatics conference (EISIC). IEEE, pp 141–147. doi: 10.1109/EISIC.2012.34

Santos I, Penya YK, Devesa J, Bringas PG (2009) N-grams-based file signatures for malware detection. In: ICEIS, pp 317–320

Schultz M, Eskin E, Zadok E, Stolfo S (2001) Data mining methods for detection of new malicious executables. In: Proceedings 2001 IEEE symposium on security and privacy, 2001. S&P 2001, pp 38–49. doi: 10.1109/SECPRI.2001.924286

Tahan G, Rokach L, Shahar Y (2012) Mal-id: automatic malware detection using common segment analysis and meta-features. J Mach Learn Res 13:949–979

Van Hulse J, Khoshgoftaar TM, Napolitano A (2007) Experimental perspectives on learning from imbalanced data. In: Proceedings of the 24th international conference on machine learning. ACM, New York, ICML ’07, pp 935–942. doi: 10.1145/1273496.1273614

Wu DJ, Mao CH, Wei TE, Lee HM, Wu KP (2012) Droidmat: Android malware detection through manifest and api calls tracing. In: 2012 seventh Asia joint conference on information security (Asia JCIS), pp 62–69. doi: 10.1109/AsiaJCIS.2012.18

Yerima S, Sezer S, McWilliams G, Muttik I (2013) A new android malware detection approach using bayesian classification. In: 2013 IEEE 27th international conference on advanced information networking and applications (AINA), pp 121–128. doi: 10.1109/AINA.2013.88

Zhang B, Yin J, Hao J, Zhang D, Wang S (2007) Malicious codes detection based on ensemble learning. In: Proceedings of the 4th international conference on autonomic and trusted computing. Springer, Berlin, Heidelberg, ATC’07, pp 468–477

Zhou Y, Jiang X (2012) Dissecting android malware: characterization and evolution. In: Proceedings of the 2012 IEEE symposium on security and privacy. IEEE Computer Society, Washington, DC, SP ’12, pp 95–109. doi: 10.1109/SP.2012.16