Debug-localize-repair: a symbiotic construction for heap manipulations

Springer Science and Business Media LLC - Tập 58 - Trang 399-439 - 2022
Sahil Verma1, Subhajit Roy2
1Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA
2Department of Computer Science and Engineering, Indian Institute of Technology Kanpur, Kanpur, India

Tóm tắt

We present Wolverine2, an integrated Debug-Localize-Repair environment for heap manipulating programs. Wolverine2 provides an interactive debugging environment: while concretely executing a program via on an interactive shell supporting common debugging facilities, Wolverine2 displays the abstract program states (as box-and-arrow diagrams) as a visual aid to the programmer, packages a novel, proof-directed repair algorithm to quickly synthesize the repair patches and a new bug localization algorithm to reduce the search space of repairs. Wolverine2 supports “hot-patching” of the generated patches to provide a seamless debugging environment, and also facilitates new debug-localize-repair possibilities: specification refinement and checkpoint-based hopping. We evaluate Wolverine2 on 6400 buggy programs (generated using automated fault injection) on a variety of data-structures like singly, doubly, and circular linked lists, AVL trees, Red-Black trees, Splay Trees and Binary Search Trees; Wolverine2 could repair all the buggy instances within realistic programmer wait-time (less than 5 s in most cases). Wolverine2 could also repair more than 80% of the 247 (buggy) student submissions where a reasonable attempt was made.

Tài liệu tham khảo

Abreu R, Zoeteweij P, Golsteijn R, Van Gemund AJ (2009) A practical evaluation of spectrum-based fault localization. J Syst Softw 82(11):1780–1792 Ball T, Naik M, Rajamani SK (2003) From symptom to cause: localizing errors in counterexample traces. In: Proceedings of the 30th ACM SIGPLAN-SIGACT symposium on principles of programming languages, POPL ’03, New York, NY, USA. ACM Bavishi R, Pandey A, Roy S (2016) Regression aware debugging for mobile applications. In: Mobile! 2016: proceedings of the 1st international workshop on mobile development (invited paper), Mobile! 2016, New York, NY, USA., pp 21–22 Association for Computing Machinery Bavishi R, Pandey A, Roy S (2016) To be precise: regression aware debugging. In: Proceedings of the 2016 ACM SIGPLAN international conference on object-oriented programming, systems, languages, and applications, OOPSLA 2016, New York, NY, USA. ACM Bendersky E. Pycparser: C parser in Python. https://pypi.python.org/pypi/pycparser. Accessed 24 Jan 2017 Bhatia S, Kohli P, Singh R (2018) Neuro-symbolic program corrector for introductory programming assignments. In: Proceedings of the 40th international conference on software engineering, ICSE ’18, New York, NY, USA. Association for Computing Machinery Chandra S, Torlak E, Barman S, Bodik R (2011) Angelic debugging. In: Proceedings of the 33rd international conference on software engineering, ICSE ’11, New York, NY, USA. ACM Chatterjee P, Chatterjee A, Campos J, Abreu R, Roy S (2020) Diagnosing software faults using multiverse analysis. In: Bessiere C (ed) Proceedings of the twenty-ninth international joint conference on artificial intelligence, IJCAI-20, vol 7. International Joint Conferences on Artificial Intelligence Organization, pp 1629–1635. Main track Chatterjee P, Roy S, Diep BP, Lal A (2020) Distributed bounded model checking. In: FMCAD, July 2020 Das R, Ahmed UZ, Karkare A, Gulwani S (2016) Prutor: a system for tutoring CS1 and collecting student programs for analysis. CoRR, arXiv:1608.03828 De Moura L, Bjørner N (2008) Z3: an efficient SMT solver. In: Proceedings of the theory and practice of software, 14th international conference on tools and algorithms for the construction and analysis of systems, Berlin, Heidelberg. Springer-Verlag, pp 337–340 Demsky B, Rinard M (2003) Automatic detection and repair of errors in data structures. In: Proceedings of the 18th annual ACM SIGPLAN conference on object-oriented programing, systems, languages, and applications, OOPSLA ’03, New York, NY, USA. ACM Elkarablieh B, Khurshid S (2008) Juzi: a tool for repairing complex data structures. In: Proceedings of the 30th international conference on software engineering, ICSE ’08, New York, NY, USA. ACM Feser JK, Chaudhuri S, Dillig I (2015) Synthesizing data structure transformations from input–output examples. In: Proceedings of the 36th ACM SIGPLAN conference on programming language design and implementation, PLDI ’15, New York, NY, USA. ACM Geeks for Geeks. Data structures. http://www.geeksforgeeks.org/data-structures/. Accessed 24 Jan 2017 Free Software Foundation. GDB Python API. https://sourceware.org/gdb/onlinedocs/gdb/Python-API.html. Accessed 25 Jan 2017 Free Software Foundation. GDB: the GNU project debugger. https://sourceware.org/gdb/. Accessed 24 Jan 2017 Garg A, Roy S (2015) Synthesizing heap manipulations via integer linear programming. In: Blazy S, Jensen T (eds) Static analysis, SAS 2015. Proceedings. Springer, Berlin Godefroid P, Klarlund N, Sen K (2005) DART: directed automated random testing. In: Proceedings of the 2005 ACM SIGPLAN conference on programming language design and implementation, PLDI ’05, New York, NY, USA. ACM Golia P, Roy S, Meel KS (2020) Manthan: a data-driven approach for Boolean function synthesis. In: Lahiri SK, Wang C (eds) Computer aided verification (CAV). Springer, Cham, pp 611–633 Golia P, Roy S, Slivovsky F, Meel KS (2021) Engineering an efficient Boolean functional synthesis engine. In: ICCAD Groce A, Chaki S, Kroening D, Strichman O (2006) Error explanation with distance metrics. Int J Softw Tools Technol Transf 8(3):229–247 Grumberg O, Lerda F, Strichman O, Theobald M (2005) Proof-guided underapproximation-widening for multi-process systems. In: Proceedings of the 32Nd ACM SIGPLAN-SIGACT symposium on principles of programming languages, POPL ’05, New York, NY, USA. ACM Gulwani S, Radiček I, Zuleger F (2018) Automated clustering and program repair for introductory programming assignments. In: Proceedings of the 39th ACM SIGPLAN conference on programming language design and implementation, PLDI 2018, New York, NY, USA. Association for Computing Machinery, pp 465–480 Guo PJ (2013) Online python tutor: embeddable web-based program visualization for CS education. In: Proceeding of the 44th ACM technical symposium on computer science education, SIGCSE ’13, New York, NY, USA. ACM Gupta R, Pal S, Kanade A, Shevade S (2017) Deepfix: fixing common C language errors by deep learning. In: Proceedings of the Thirty-First AAAI conference on artificial intelligence, AAAI’17. AAAI Press, pp 1345–1351 Hoare CAR (1969) An axiomatic basis for computer programming. Commun ACM 12(10):576–580 Hu Q, Samanta R, Singh R, D’Antoni L (2019) Direct manipulation for imperative programs. In: Chang BY (ed) Static analysis. Springer, Cham, pp 347–367 Hu Y, Ahmed UZ, Mechtaev S, Leong B, Roychoudhury A (2019) Re-factoring based program repair applied to programming assignments. In: ASE ’19. IEEE Press, pp 388–398 igraph—the network analysis package. http://igraph.org/python/. Accessed 24 Jan 2017 Jha S, Gulwani S, Seshia SA, Tiwari A (2010) Oracle-guided component-based program synthesis. In: Proceedings of the 32Nd ACM/IEEE international conference on software engineering, ICSE ’10, New York, NY, USA, vol 1. ACM Jones JA, Harrold MJ (2005) Empirical evaluation of the tarantula automatic fault-localization technique. In: Proceedings of the 20th IEEE/ACM international conference on automated software engineering, ASE ’05, New York, NY, USA. ACM Jose M, Majumdar R (2011) Bug-assist: assisting fault localization in ANSI-C programs. In: Proceedings of the 23rd international conference on computer aided verification, CAV’11, Berlin, Heidelberg. Springer-Verlag Jose M, Majumdar R (2011) Cause clue clauses: error localization using maximum satisfiability. In: Proceedings of the 32nd ACM SIGPLAN conference on programming language design and implementation, PLDI ’11, New York, NY, USA. ACM Kneuss E, Koukoutos M, Kuncak V (2015) Deductive program repair. In: CAV Le Goues C, Nguyen T, Forrest S, Weimer W (2012) GenProg: a generic method for automatic software repair. IEEE Trans Softw Eng 38:54–72 Leung A, Sarracino J, Lerner S (2015) Interactive parser synthesis by example. SIGPLAN Not 50(6):565–574 Liblit B, Naik M, Zheng AX, Aiken A, Jordan MI (2005) Scalable statistical bug isolation. In: Proceedings of the 2005 ACM SIGPLAN conference on programming language design and implementation, PLDI ’05, New York, NY, USA. ACM Liu C, Yan X, Fei L, Han J, Midkiff SP (2005) SOBER: statistical model-based bug localization. In: ESEC/FSE-13, New York, NY, USA. ACM Liu Y, Li B (2010) Automated program debugging via multiple predicate switching. In: Proceedings of the twenty-fourth AAAI conference on artificial intelligence, AAAI’10. AAAI Press Loncaric C, Ernst MD, Torlak E (2018) Generalized data structure synthesis. In: Proceedings of the 40th international conference on software engineering, ICSE ’18, New York, NY, USA. Association for Computing Machinery, pp 958–968 Long F, Rinard M (2016) Automatic patch generation by learning correct code. In: Proceedings of the 43rd annual ACM SIGPLAN-SIGACT symposium on principles of programming languages, POPL ’16, New York, NY, USA. ACM Malik MZ, Siddiqui JH, Khurshid S (2011) Constraint-based program debugging using data structure repair. In: 2011 fourth IEEE international conference on software testing, verification and validation, March 2011 Malik MZ, Ghori K, Elkarablieh B, Khurshid S (2009) A case for automated debugging using data structure repair. In: Proceedings of the 2009 IEEE/ACM international conference on automated software engineering, ASE ’09, USA. IEEE Computer Society Mechtaev S, Yi J, Roychoudhury A (2015) DirectFix: looking for simple program repairs. In: Proceedings of the 37th international conference on software engineering, ICSE ’15, Piscataway, NJ, USA, vol 1, pp 448–458 Mechtaev S, Yi J, Roychoudhury A (2016) Angelix: scalable multiline program patch synthesis via symbolic analysis. In: Proceedings of the 38th international conference on software engineering, ICSE ’16, New York, NY, USA. ACM Modi V, Roy S, Aggarwal SK (2013) Exploring program phases for statistical bug localization. In: Proceedings of the 11th ACM SIGPLAN-SIGSOFT workshop on program analysis for software tools and engineering, PASTE ’13, New York, NY, USA. ACM Nguyen HD, Qi D, Roychoudhury A, Chandra S (2013) SemFix: program repair via semantic analysis. In: Proceedings of the 2013 international conference on software engineering, ICSE ’13, Piscataway, NJ, USA. IEEE Press Nguyen T-T, Ta Q-T, Chin W-N (2019) Automatic program repair using formal verification and expression templates. In: VMCAI Nguyen T, Weimer W, Le Goues C, Forrest S (2009) Using execution paths to evolve software patches. In: International conference on software testing, verification and validation workshops, 2009. ICSTW’09. IEEE, pp 152–153 Pandey A, Kotcharlakota PR, Roy S (2019) Deferred concretization in symbolic execution via fuzzing. In: Proceedings of the 28th ACM SIGSOFT international symposium on software testing and analysis, ISSTA 2019, New York, NY, USA. Association for Computing Machinery, pp 228–238 Pham V-T, Khurana S, Roy S, Roychoudhury A (2017) Bucketing failing tests via symbolic analysis. In: Huisman M, Rubin J (eds) Fundamental approaches to software engineering. Springer, Berlin, pp 43–59 Polikarpova N, Sergey I (2019) Structuring the synthesis of heap-manipulating programs. Proc ACM Program Lang 3:1–30 Pu Y, Narasimhan K, Solar-Lezama A, Barzilay R (2016) Sk_p: a neural program corrector for moocs. In: Companion proceedings of the 2016 ACM SIGPLAN international conference on systems, programming, languages and applications: software for humanity, SPLASH Companion 2016, page 39-40, New York, NY, USA. Association for Computing Machinery Rolim R, Soares G, D’Antoni L, Polozov O, Gulwani S, Gheyi R, Suzuki R, Hartmann B (2017) Learning syntactic program transformations from examples. In: Proceedings of the 39th international conference on software engineering, ICSE ’17. IEEE Press Roy S, Hsu J, Albarghouthi A (2021) Learning differentially private mechanisms. In: 2021 IEEE symposium on security and privacy (SP), Los Alamitos, CA, USA, May 2021. IEEE Computer Society, pp 852–865 Roy S (2013) From concrete examples to heap manipulating programs. In: Logozzo F, Fähndrich M (eds) Static analysis: 20th international symposium, SAS 2013, Seattle, WA, USA, 20–22 June 2013. Proceedings. Springer, Berlin Roy S, Pandey A, Dolan-Gavitt B, Hu Y (2018) Bug synthesis: challenging bug-finding tools with deep faults. In: Proceedings of the 2018 26th ACM joint meeting on European software engineering conference and symposium on the foundations of software engineering, ESEC/FSE 2018, New York, NY, USA. Association for Computing Machinery, pp 224–234 Sen K, Marinov D, Agha G (2005) CUTE: a concolic unit testing engine for C. In: ESEC/FSE-13, New York, NY, USA. ACM Singal D, Agarwal P, Jhunjhunwala S, Roy S (2018) Parse condition: symbolic encoding of ll(1) parsing. In: Barthe G, Sutcliffe G, Veanes M (eds) LPAR-22. 22nd international conference on logic for programming, artificial intelligence and reasoning. EPiC series in computing, vol 57. EasyChair, pp 637–655 Singh R, Gulwani S, Solar-Lezama A (2013) Automated feedback generation for introductory programming assignments. In: Proceedings of the 34th ACM SIGPLAN conference on programming language design and implementation, PLDI ’13, New York, NY, USA. Association for Computing Machinery Singh R, Solar-Lezama A (2011) Synthesizing data structure manipulations from storyboards. In: Proceedings of the 19th ACM SIGSOFT symposium and the 13th European conference on foundations of software engineering, ESEC/FSE ’11, New York, NY, USA. ACM Tan SH, Yoshida H, Prasad MR, Roychoudhury A (2016) Anti-patterns in search-based program repair. In: Proceedings of the 2016 24th ACM SIGSOFT international symposium on foundations of software engineering, FSE 2016, New York, NY, USA. Association for Computing Machinery, pp 727–738 Verma A, Kalita PK, Pandey A, Roy S (2020) Interactive debugging of concurrent programs under relaxed memory models. In: Proceedings of the 18th ACM/IEEE international symposium on code generation and optimization, CGO 2020, New York, NY, USA. Association for Computing Machinery, pp 68–80 Verma S, Roy S (2017) Synergistic debug-repair of heap manipulations. In: Proceedings of the 2017 11th joint meeting on foundations of software engineering, ESEC/FSE 2017, New York, NY, USA. ACM Wang K, Singh R, Su Z (2017) Dynamic neural program embedding for program repair. arXiv, arXiv:1711.07163 Wang K, Singh R, Su Z (2018) Search, align, and repair: data-driven feedback generation for introductory programming exercises. In: Proceedings of the 39th ACM SIGPLAN conference on programming language design and implementation, PLDI 2018,, New York, NY, USA. Association for Computing Machinery pp 481–495 Weimer W (2006) Patches as better bug reports. In: Proceedings of the 5th international conference on generative programming and component engineering, GPCE ’06, New York, NY, USA. ACM Weimer W, Nguyen T, Le Goues C, Forrest S (2009) Automatically finding patches using genetic programming. In: Proceedings of the 31st international conference on software engineering, ICSE ’09, Washington, DC, USA. IEEE Computer Society Zeller A (2002) Isolating cause-effect chains from computer programs. In: Proceedings of the 10th ACM SIGSOFT symposium on foundations of software engineering, SIGSOFT ’02/FSE-10, New York, NY, USA. ACM Zeller A, Hildebrandt R (2002) Simplifying and isolating failure-inducing input. IEEE Trans Softw Eng 28(2):183–200 Zimmermann T, Zeller A (2002) Visualizing memory graphs. In: Diehl S (ed) Revised lectures on software visualization, international seminar. Springer-Verlag, London