Empirical study of android repackaged applications

Empirical Software Engineering - Volume 24 - Pages 3587-3629 - 2019
Kobra Khanmohammadi¹, Neda Ebrahimi¹, Abdelwahab Hamou-Lhadj¹, Raphaël Khoury²
¹ Department of Electrical and Computer Engineering, Concordia University, QC, Canada
² Department of Computer Science and Mathematics, Université du Québec à Chicoutimi, QC, Canada

Abstract

The growing popularity of Android applications has raised concerns about piracy and the spread of malware, particularly adware: malware that seeks to present unwanted advertisements to the user. A popular way to distribute malware in the mobile world is through the repackaging of legitimate apps. This process consists of downloading, unpacking, manipulating, and recompiling an application, then publishing it again in an app store. In this paper, we conduct an empirical study of over 15,000 apps to gain insights into the factors that drive the spread of repackaged apps. We also examine the motivations of developers who publish repackaged apps and those of users who download them, as well as the factors that determine which apps are chosen for repackaging and the ways in which apps are modified during the repackaging process. Having observed that adware is particularly prevalent in repackaged apps, we focus on this type of malware and examine how an app is modified when adware is injected into its code. Our findings, which can be useful to security experts, shed much-needed light on this class of malware and allow us to make recommendations that could lead to more effective malware detection tools. Furthermore, on the basis of our results, we propose a novel app indexing scheme that minimizes the number of comparisons needed to detect repackaged apps.
