A hierarchical model for quantifying software security based on static analysis alerts and software metrics

Software Quality Journal - Volume 29 - Pages 431-507 - 2021
Miltiadis Siavvas1,2, Dionysios Kehagias2, Dimitrios Tzovaras2, Erol Gelenbe1,3
1Imperial College London, London, UK
2Centre for Research and Technology Hellas, Thessaloniki, Greece
3Institute of Theoretical & Applied Informatics, Polish Academy of Sciences, Gliwice, Poland

Abstract

Despite the acknowledged importance of quantitative security assessment in secure software development, the literature still lacks an efficient model for measuring internal software security risk. To this end, we introduce a hierarchical security assessment model (SAM) that assesses the internal security level of software products based on low-level indicators, i.e., security-relevant static analysis alerts and software metrics. Following the guidelines of ISO/IEC 25010, and using a set of thresholds and weights, the model systematically aggregates these low-level indicators to produce a high-level security score that reflects the internal security level of the analyzed software. The model is practical: it is fully automated and operationalized both as a standalone tool and as part of a broader Computer-Aided Software Engineering (CASE) platform. To enhance its reliability, the thresholds of the model were calibrated on a repository of 100 popular software applications retrieved from the Maven Repository. Its weights were elicited so as to chiefly reflect the knowledge expressed by the Common Weakness Enumeration (CWE), through a novel weight elicitation approach grounded in popular decision-making techniques. The model was evaluated on a large repository of 150 open-source software applications retrieved from GitHub and 1200 classes retrieved from the OWASP Benchmark. The experimental results revealed the capacity of the model to reliably assess internal security at both the product level and the class level of granularity, with sufficient discriminative power. They also provide preliminary evidence that the model can serve as the basis for vulnerability prediction. To the best of our knowledge, this is the first fully automated, operationalized, and sufficiently evaluated security assessment model in the modern literature.
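The aggregation idea described in the abstract (low-level indicators, normalized against thresholds and combined through weights into a single score) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the indicator names, the two-threshold linear scoring function, the characteristic names, and every threshold and weight value below are hypothetical and chosen only to show the shape of a hierarchical weighted aggregation.

```python
# Illustrative sketch only: all names, thresholds, and weights are hypothetical.

def score_indicator(value, lower, upper):
    """Map a raw indicator value to [0, 1]; lower raw values are treated as better.

    Values at or below `lower` score 1.0, values at or above `upper` score 0.0,
    and values in between are interpolated linearly.
    """
    if value <= lower:
        return 1.0
    if value >= upper:
        return 0.0
    return (upper - value) / (upper - lower)


def weighted_average(scores, weights):
    """Combine child scores into a parent score using normalized weights."""
    total = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total


# Hypothetical thresholds (lower, upper) per low-level indicator.
THRESHOLDS = {
    "injection_alerts_per_kloc": (0.0, 2.0),
    "avg_cyclomatic_complexity": (5.0, 15.0),
}

# Hypothetical weights: indicators -> characteristics -> overall score.
CHARACTERISTIC_WEIGHTS = {
    "confidentiality": {"injection_alerts_per_kloc": 0.75,
                        "avg_cyclomatic_complexity": 0.25},
    "integrity": {"injection_alerts_per_kloc": 0.5,
                  "avg_cyclomatic_complexity": 0.5},
}
OVERALL_WEIGHTS = {"confidentiality": 0.5, "integrity": 0.5}


def security_score(raw_indicators):
    """Aggregate raw indicator values bottom-up into one score in [0, 1]."""
    indicator_scores = {
        name: score_indicator(raw_indicators[name], *THRESHOLDS[name])
        for name in THRESHOLDS
    }
    characteristic_scores = {
        name: weighted_average(indicator_scores, weights)
        for name, weights in CHARACTERISTIC_WEIGHTS.items()
    }
    overall = weighted_average(characteristic_scores, OVERALL_WEIGHTS)
    return indicator_scores, characteristic_scores, overall


if __name__ == "__main__":
    raw = {"injection_alerts_per_kloc": 0.5, "avg_cyclomatic_complexity": 10.0}
    indicators, characteristics, overall = security_score(raw)
    print(indicators)
    print(characteristics)
    print(overall)
```

In this sketch the thresholds stand in for values that, per the abstract, would be calibrated on a benchmark repository, and the weights stand in for values elicited from an external knowledge source such as CWE.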
