Static data-flow analysis for software product lines in C

Automated Software Engineering - Volume 29 - Pages 1-37 - 2022
Philipp Dominik Schubert1, Paul Gazzillo2, Zach Patterson3, Julian Braha2, Fabian Schiebel4, Ben Hermann5, Shiyi Wei3, Eric Bodden1,4
1Paderborn University, Paderborn, Germany
2University of Central Florida, Florida, USA
3University of Texas at Dallas, Dallas, USA
4Fraunhofer IEM, Paderborn, Germany
5Technische Universität Dortmund, Dortmund, Germany

Abstract

Many critical codebases are written in C, and most of them use preprocessor directives to encode variability, effectively making them software product lines. These preprocessor directives, however, challenge any static code analysis. SPLlift, a previously presented approach for analyzing software product lines, is limited to Java programs that use a rather simple feature encoding and to analysis problems with a finite and ideally small domain. Other approaches that allow the analysis of real-world C software product lines use special-purpose analyses, which prevents the reuse of existing analysis infrastructures and ignores the progress made by the static analysis community. This work presents VarAlyzer, a novel static analysis approach for software product lines. VarAlyzer first transforms preprocessor constructs to plain C while preserving their variability and semantics. It then solves any given distributive analysis problem on the transformed product lines in a variability-aware manner. VarAlyzer's analysis results are annotated with feature constraints that encode the configurations in which each result holds. Our experiments with 95 compilation units of OpenSSL show that VarAlyzer makes it possible, for the first time, to conduct inter-procedural, flow-, field- and context-sensitive data-flow analyses on entire product lines, and that it outperforms the product-based approach for highly configurable systems.
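The transformation step described in the abstract can be illustrated with a small sketch. It is not VarAlyzer's actual output: the macro `DEBUG_LOG`, the variable `cfg_debug_log`, and all function names are hypothetical, and the brute-force check over both configurations only stands in for a real static analysis.

```c
#include <assert.h>
#include <stdbool.h>

/* Original product-line code, shown as a comment because one compilation
 * can select only one configuration:
 *
 *   int build_reply(int secret) {
 *       int reply = 0;
 *   #ifdef DEBUG_LOG
 *       reply = secret;
 *   #endif
 *       return reply;
 *   }
 */

/* Hypothetical configuration variable standing in for the macro DEBUG_LOG.
 * A variability-aware analysis would treat its value as unknown. */
bool cfg_debug_log;

/* Transformed plain-C version: the #ifdef is now an ordinary branch, so
 * both variants live in a single function body. */
int build_reply(int secret) {
    int reply = 0;
    if (cfg_debug_log) {
        reply = secret;
    }
    return reply;
}

/* Brute-force stand-in for the analysis: run one configuration and report
 * whether the secret reaches the return value. */
bool secret_reaches_reply(bool debug_log) {
    const int secret = 42; /* arbitrary nonzero test value */
    cfg_debug_log = debug_log;
    return build_reply(secret) == secret;
}

/* Self-check: the flow exists exactly in the DEBUG_LOG configuration. */
void check_constraint(void) {
    assert(secret_reaches_reply(true));
    assert(!secret_reaches_reply(false));
}
```

In this sketch the flow from `secret` to the return value exists only when `cfg_debug_log` is true, so a variability-aware result for that flow would carry the feature constraint `DEBUG_LOG`. Calling `check_constraint()` from a `main` function exercises both configurations in one run, which a single preprocessed product could not do.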
