Seeding Grammars in Grammatical Evolution to Improve Search-Based Software Testing

SN Computer Science - Tập 2 - Trang 1-19 - 2021
Muhammad Sheraz Anjum1, Conor Ryan1
1Biocomputing and Developmental Systems Group, Department of Computer Science and Information Systems, Lero – Science Foundation Ireland Research Centre for Software, University of Limerick, Limerick, Ireland

Tóm tắt

Heuristic-based optimization techniques have been increasingly used to automate different types of code coverage analysis. Several studies suggest that interdependencies (in the form of comparisons) may exist between the condition constructs, of variables and constant values, in the branching conditions of real-world programs, e.g. ( $$i \le 100$$ ) or ( $$i==j$$ ), etc. In this work, by interdependencies we refer to the situations where, to satisfy a branching condition, there must be a certain relationship between the values of some specific condition constructs (which may or may not be a part of the respective condition predicates). For example, the values of variables i and j must be equal to satisfy the condition of ( $$i==j$$ ), and the value of variable k must be equal to 100 for the satisfaction of the condition of ( $$k==100$$ ). To date, only the Ariadne, a Grammatical Evolution (GE)-based system, exploits these interdependencies between input variables (e.g. of the form ( $$i \le j$$ ) or ( $$i==j$$ ), etc.) to efficiently generate test data. Ariadne employs a simple attribute grammar to exploit these dependencies, which enables it to evolve complex test data, and has been compared favourably to other well-known techniques in the literature. However, Ariadne does not benefit from interdependencies involving constants, e.g. ( $$i \le 100$$ ) or ( $$j==500$$ ), etc., due to the difficulty in evolving precise values, and these are equally important constructs of condition predicates. Furthermore, constant creation in GE can be difficult, particularly with high precision. We propose to seed the grammar with constants extracted from the source code of the program under test to enhance and extend Ariadne’s capability to exploit richer types of dependencies (involving all combinations of both variables and constant values). We compared our results with the original system of Ariadne against a large set of benchmark problems which include 10 numeric programs in addition to the ones originally used for Ariadne. Our results demonstrate that the seeding strategy not only dramatically improves the generality of the system, as it improves the code coverage (effectiveness) by impressive margins, but it also reduces the search budgets (efficiency) often up to an order of magnitude. Moreover, we also performed a rigorous analysis to investigate the scalability of our improved Ariadne, showing that it stays highly scalable when compared to both the original system of Ariadne and GA-based test data generation approach.

Tài liệu tham khảo

Afzal W, Torkar R, Feldt R. A systematic review of search-based testing for non-functional system properties. Inf Softw Technol. 2009;51(6):957–76. Aleti A, Buhnova B, Grunske L, Koziolek A, Meedeniya I. Software architecture optimization methods: a systematic literature review. IEEE Trans Softw Eng. 2013;39(5):658–83. Ali S, Briand LC, Hemmati H, Panesar-Walawege RK. A systematic review of the application and empirical investigation of search-based test case generation. IEEE Trans Softw Eng. 2010;36(6):742–62. Alshahwan N, Harman M. Automated web application testing using search based software engineering. In: Proceedings of the 2011 26th IEEE/ACM International Conference on Automated Software Engineering. IEEE Computer Society; 2011. p. 3–12. Anand S, Burke EK, Chen TY, Clark J, Cohen MB, Grieskamp W, Harman M, Harrold MJ, Mcminn P, Bertolino A, et al. An orchestrated survey of methodologies for automated software test case generation. J Syst Softw. 2013;86(8):1978–2001. Anjum MS, Ryan C. Ariadne: Evolving test data using grammatical evolution. In: European Conference on Genetic Programming. Springer; 2019. Anjum MS, Ryan C. Scalability analysis of grammatical evolution based test data generation. In: Proceedings of the 2020 Genetic and Evolutionary Computation Conference, p. 1213–21, 2020. Anjum MS, Ryan C. Seeding grammars in grammatical evolution to improve search based software testing. In: European Conference on Genetic Programming (Part of EvoStar). Springer; 2020. p. 18–34. Azad RMA, Ryan C. The best things don’t always come in small packages: constant creation in grammatical evolution. In: European Conference on Genetic Programming. Springer; 2014. p. 186–97. Baars A, Harman M, Hassoun Y, Lakhotia K, McMinn P, Tonella P, Vos T. Symbolic search-based testing. In: Proceedings of the 2011 26th IEEE/ACM International Conference on Automated Software Engineering. IEEE Computer Society; 2011. p. 53–62. Barros RC, Basgalupp MP, Cerri R, da Silva TS, de Carvalho ACPLF. A grammatical evolution approach for software effort estimation. In: Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation. ACM; 2013. p. 1413–20. Beizer B. Software testing techniques, van nostrand reinhold, 2nd edn. New York: Inc; 1990. ISBN 0-442-20672-0 bibclean.c. 1995. http://www.cs.bham.ac.uk/~wbl/biblio/tools/bibclean.c. Accessed 15 Sept 2019. Bidgoli AM, Haghighi H. A new approach for search space reduction and seeding by analysis of the clauses. In: International Symposium on Search Based Software Engineering. Springer; 2018. p. 343–8. Chen T, Li M, Yao . On the effects of seeding strategies: a case for search-based multi-objective service composition. In: Proceedings of the Genetic and Evolutionary Computation Conference. ACM; 2018. p. 1419–26. Chen T, Li M, Yao X. Standing on the shoulders of giants: seeding search-based multi-objective optimization with prior knowledge for software service composition. Inf Softw Technol. 2019;114:155–75. Clarke LA. A system to generate test data and symbolically execute programs. IEEE Trans Softw Eng. 1976;3:215–22. Cohen EI. A finite domain-testing strategy for computer program testing. PhD thesis, The Ohio State University, 1978. de Andrade J, Silva L, Britto A, Amaral R. Solving the software project scheduling problem with hyper-heuristics. In: International Conference on Artificial Intelligence and Soft Computing. Springer; 2019. p. 399–411. DeMilli RA, Offutt AJ. Constraint-based automatic test data generation. IEEE Trans Softw Eng. 1991;17(9):900–10. Dempsey I, O’Neill M, Brabazon A. Constant creation in grammatical evolution. Int J Innov Comput Appl. 2007;1(1):23–38. Elshoff JL. An analysis of some commercial PL/I programs. IEEE Trans Softw Eng. 1976;2:113–20. Ferguson R, Korel B. The chaining approach for software test data generation. ACM Trans Softw Eng Methodol (TOSEM). 1996;5(1):63–86. Fraser G, Arcuri A. The seed is strong: Seeding strategies in search-based software testing. In: 2012 IEEE Fifth International Conference on Software Testing, Verification and Validation. IEEE; 2012. p. 121–30. Fraser G, Arcuri A. Whole test suite generation. IEEE Trans Softw Eng. 2013;39(2):276–91. Fraser G, Arcuri A, McMinn P. A memetic algorithm for whole test suite generation. J Syst Softw. 2015;103:311–27. Fraser G, Zeller A. Exploiting common object usage in test case generation. In: 2011 Fourth IEEE International Conference on Software Testing, Verification and Validation. IEEE; 2011. p. 80–9. Galeotti JP, Fraser G, Arcuri A. Improving search-based test suite generation with dynamic symbolic execution. In: 2013 IEEE 24th International Symposium on Software Reliability Engineering (ISSRE). IEEE; 2013. p. 360–9 Harman M, Jia Y, Zhang Y. Achievements, open problems and challenges for search based software testing. In: 2015 IEEE 8th International Conference on Software Testing, Verification and Validation (ICST). IEEE; 2015. p. 1–12. Harman M, McMinn P. A theoretical and empirical study of search-based testing: local, global, and hybrid search. IEEE Trans Softw Eng. 2010;36(2):226–47. Holland JH. Genetic algorithms. Sci Am. 1992;267(1):66–73. Inkumsah K, Xie T. Evacon: a framework for integrating evolutionary and concolic testing for object-oriented programs. In: Proceedings of the Twenty-Second IEEE/ACM International Conference on Automated Software Engineering. ACM; 2007. p. 425–8. Jin R, Jiang S, Zhang H. Generation of test data based on genetic algorithms and program dependence analysis. In: 2011 IEEE International Conference on Cyber Technology in Automation, Control, and Intelligent Systems. IEEE; 2011. p. 116–21. Jones BF, Sthamer H-H, Eyres DE. Automatic structural testing using genetic algorithms. Softw Eng J. 1996;11(5):299–306. Kifetew FM, Jin W, Tiella R, Orso A, Tonella P. Reproducing field failures for programs with complex grammar-based input. In: 2014 IEEE Seventh International Conference on Software Testing, Verification and Validation. IEEE; 2014. p. 163–72. Korel B. Automated software test data generation. IEEE Trans Softw Eng. 1990;16(8):870–9. Lima JAP, Vergilio SR, et al. Automatic generation of search-based algorithms applied to the feature testing of software product lines. In: Proceedings of the 31st Brazilian Symposium on Software Engineering. ACM; 2017. p. 114–23. Lopez-Herrejon RE, Ferrer J, Chicano F, Egyed A, Alba E. Comparative analysis of classical multi-objective evolutionary algorithms and seeding strategies for pairwise testing of software product lines. In: 2014 IEEE Congress on Evolutionary Computation (CEC). IEEE; 2014. p. 387–96. Mariani T, Guizzo G, Vergilio SR, Pozo ATR. Grammatical evolution for the multi-objective integration and test order problem. In: Proceedings of the Genetic and Evolutionary Computation Conference 2016. ACM; 2016. p. 1069–76. McDermott J, White DR, Luke S, Manzoni L, Castelli M, Vanneschi L, Jaskowski W, Krawiec K, Harper R, De Jong K, et al. Genetic programming needs better benchmarks. In: Proceedings of the 14th Annual Conference on Genetic and Evolutionary Computation, p. 791–8, 2012. McMinn P. Search-based software test data generation: a survey. Softw Test Verif Reliab. 2004;14(2):105–56. McMinn P, Stevenson M, Harman M. Reducing qualitative human oracle costs associated with automatically generated test data. In: Proceedings of the First International Workshop on Software Test Output Validation. ACM; 2010. p. 1–4. Michael CC, McGraw G, Schatz MA. Generating software test data by evolution. IEEE Trans Softw Eng. 2001;12:1085–110. Miller J, Reformat M, Zhang H. Automatic test data generation using genetic algorithm and program dependence graphs. Inf Softw Technol. 2006;48(7):586–605. Miller W, Spooner DL. Automatic generation of floating-point test data. IEEE Trans Softw Eng. 1976;3:223–6. Myers GJ, Badgett T, Thomas TM, Sandler C. The art of software testing, vol. 2. New York: Wiley Online Library; 2004. Offutt AJ, Jin Z, Pan J. The dynamic domain reduction procedure for test data generation. Softw Pract Exp. 1999;29(2):167–93. O’Neill M, Ryan C. Grammatical evolution. IEEE Trans Evol Comput. 2001;5(4):349–58. Panichella A, Kifetew FM, Tonella P. Reformulating branch coverage as a many-objective optimization problem. In: 2015 IEEE 8th International Conference on Software Testing, Verification and Validation (ICST). IEEE; 2015. p. 1–10. Pargas RP, Harrold MJ, Peck RR. Test-data generation using genetic algorithms. Softw Test Verif Reliab. 1999;9(4):263–82. Patten JV, Ryan C. Procedural content generation for games using grammatical evolution and attribute grammars. 2014. Rojas JM, Fraser G, Arcuri A. Seeding strategies in search-based unit test generation. Softw Test Verif Reliab. 2016;26(5):366–401. Ryan C, Collins JJ, Neill MO. Grammatical evolution: Evolving programs for an arbitrary language. In: European Conference on Genetic Programming. Springer; 1998. p. 83–96. Sauder RL. A general test data generator for cobol. In: Proceedings of the May 1–3, 1962, Spring Joint Computer Conference, p. 317–23, 1962. Sparks S, Embleton S, Cunningham R, Zou C. Automated vulnerability analysis: leveraging control flow for evolutionary input crafting. In: Twenty-Third Annual Computer Security Applications Conference (ACSAC 2007). IEEE; 2007. p. 477–86. Tlili M, Wappler S, Sthamer H. Improving evolutionary real-time testing. In: Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation. ACM; 2006. p. 1917–24. Tracey N, Clark J, Mander K, McDermid J. An automated framework for structural test-data generation. In: ASE. IEEE; 1998. p. 285. Wegener J, Baresel A, Sthamer H. Evolutionary test environment for automatic structural testing. Inf Softw Technol. 2001;43(14):841–54. Xanthakis S, Ellis C, Skourlas C, Le Gall A, Katsikas S, Karapoulios K. Application of genetic algorithms to software testing. In: Proceedings of the 5th International Conference on Software Engineering and Applications, p. 625–36, 1992. Zhu Z, Jiao L, Xu X. Combining search-based testing and dynamic symbolic execution by evolvability metric. In: 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME). IEEE; 2018. p. 59–68.