Active learning for extended finite state machines

Formal Aspects of Computing - Tập 28 - Trang 233-263 - 2016
Sofia Cassel1, Falk Howar2, Bengt Jonsson1, Bernhard Steffen3
1Department of Information Technology, Uppsala University, Uppsala, Sweden
2IPSSE, TU Clausthal, Clausthal-Zellerfeld, Germany
3Chair for Programming Systems, TU Dortmund, Dortmund, Germany

Tóm tắt

We present a black-box active learning algorithm for inferring extended finite state machines (EFSM)s by dynamic black-box analysis. EFSMs can be used to model both data flow and control behavior of software and hardware components. Different dialects of EFSMs are widely used in tools for model-based software development, verification, and testing. Our algorithm infers a class of EFSMs called register automata. Register automata have a finite control structure, extended with variables (registers), assignments, and guards. Our algorithm is parameterized on a particular theory, i.e., a set of operations and tests on the data domain that can be used in guards. Key to our learning technique is a novel learning model based on so-called tree queries. The learning algorithm uses tree queries to infer symbolic data constraints on parameters, e.g., sequence numbers, time stamps, identifiers, or even simple arithmetic. We describe sufficient conditions for the properties that the symbolic constraints provided by a tree query in general must have to be usable in our learning model. We also show that, under these conditions, our framework induces a generalization of the classical Nerode equivalence and canonical automata construction to the symbolic setting. We have evaluated our algorithm in a black-box scenario, where tree queries are realized through (black-box) testing. Our case studies include connection establishment in TCP and a priority queue from the Java Class Library.

Tài liệu tham khảo

Ammons G, Bodík R, Larus JR (2002) Mining specifications. In: Proc. POPL 2002, pp 4–16. ACM Alur R, Cerný P, Madhusudan P, Nam W (2005) Synthesis of interface specifications for Java classes. In: Proc. POPL 2005, pp 98–109. ACM Aarts F, Ruiter JD, Poll E (2013) Formal models of bank cards for free. In: Proc. ICSTW 2013, pp 461–468. IEEE Aarts F, Heidarian F, Kuppens H, Olsen P, Vaandrager FW (2012) Automata learning through counterexample guided abstraction refinement. In: Proc. FM 2012, volume 7436 of LNCS, pp 10–27. Springer Aarts F, Howar F, Kuppens H, Vaandrager FW (2014) Algorithms for inferring register automata—a comparison of existing approaches. In: Proc. ISoLA 2014, Part I, volume 8802 of LNCS, pp 202–219. Springer Aarts F, Jonsson B, Uijen J, Vaandrager F (2014) Generating models of infinite-state communication protocols using regular inference with abstraction. Formal Methods Syst Design 46(1):1–41 Aarts F, Kuppens H, Tretmans J, Vaandrager FW, Verwer S (2012) Learning and testing the bounded retransmission protocol. In: Proc. ICGI 2012, volume 21 of JMLR Proceedings, pp 4–18. JMLR.org Angluin D (1987) Learning regular sets from queries and counterexamples. Inf Comput 75(2): 87–106 Reo FA (2004) A channel-based coordination model for component composition. Math Struct Comput Sci 14(3): 329–366 Aarts F, Schmaltz J, Vaandrager FW (2010) Inference and abstraction of the biometric passport. In: Proc. ISoLA 2010, Part I, volume 6415 of LNCS, pp 673–686. Springer Botinčan M, Babić D (2013) Sigma*: symbolic learning of input–output specifications. In: Proc. POPL 2013, pp 443–456. ACM Ball T, Bounimova E, Cook B, Levin V, Lichtenberg J, McGarvey C, Ondrusek B, Rajamani SK, Ustuner A (2006) Thorough static analysis of device drivers. In: Proc. 2006 EuroSys Conf., pp 73–85. ACM Bollig B, Habermehl P, Leucker M, Monmege B (2013) A fresh approach to learning register automata. In: Proc. DLT 2013, volume 7907 of LNCS, pp 118–130. Springer Broy M, Jonsson B, Katoen J-P, Leucker M, Pretschner A (eds) (2004) Model-based testing of reactive systems, volume 3472 of LNCS. Springer, Berlin Berg T, Jonsson B, Raffelt H (2008) Regular inference for state machines using domains with equality tests. In: Proc. FASE, volume 4961 of LNCS, pp 317–331 Bertoli P, Pistore M, Traverso P (2010) Automated composition of web services via planning in asynchronous domains. Artif Intell 174(3-4): 316–361 Clarke E. M, Grumberg O, Peled D (2001) Model checking. MIT Press, Cambridge Cassel S, Howar F, Jonsson B (2015) RALib: a LearnLib extension for inferring efsms. In: DIFTS 2015, Available online: http://www.faculty.ece.vt.edu/chaowang/difts2015/papers/paper_5.pdf. Cassel S, Howar F, Jonsson B, Steffen B (2014) Learning extended finite state machines. In: Proc. SEFM 2014, volume 8702 of LNCS, pp 250–264. Springer Moura L. MD, Bjørner N (2008) Z3: an efficient SMT solver. In: Proc. TACAS 2008, volume 4963 of LNCS, pp 337–340. Springer Ernst MD, Perkins JH, Guo PJ, McCamant S, Pacheco C, Tschantz MS, Xiao C (2007) The Daikon system for dynamic detection of likely invariants. Sci Comput Program 69(1-3): 35–45 Gery E, Harel D, Rhapsody EP (2002) A complete life-cycle model-based development system. In: Proc. IFM 2002, volume 2335 of LNCS, pp 1–10. Springer Groz R, Irfan M-N, Oriat C (2012) Algorithmic improvements on regular inference of software models and perspectives for security testing. In: Proc. ISoLA 2012, Part I, volume 7609 of LNCS, pp 444–457. Springer Giannakopoulou D, Rakamarić Z, Raman V (2012) Symbolic learning of component interfaces. In: Proc. SAS 2012, volume 7460 of LNCS, pp 248–264. Springer, Berlin, Heidelberg Hagerer A, Hungar H, Niese O, Steffen B (2002) Model generation by moderated regular extrapolation. In: Proc. FASE 2002, volume 2306 of LNCS, pp 80–95. Springer Howar F, Isberner M, Steffen B, Bauer O, Jonsson B (2012) Inferring semantic interfaces of data structures. In: Proc. ISoLA 2012, Part I, volume 7609 of LNCS, pp 554–571. Springer Henzinger TA, Jhala R, Majumdar R (2005) Permissive interfaces. In: Proc. ESEC/FSE 2005, pp 31–40. ACM Hungar H, Niese O, Steffen B (2003) Domain-specific optimization in automata learning. In: Proc. CAV 2003, volume 2725 of LNCS, pp 315–327. Springer Howar F (2012) Active learning of interface programs. PhD thesis, Technical University of Dortmund, Germany, 2012 Howar F, Steffen B, Jonsson B, Cassel S (2012) Inferring canonical register automata. In: Proc. VMCAI 2012, volume 7148 of LNCS, pp 251–266. Springer Howar F, Steffen B, Merten M (2011) Automata learning with automated alphabet abstraction refinement. In: Proc. VMCAI 2011, volume 6538 of LNCS, pp 263–277. Springer Huima A (2007) Implementing Conformiq Qtronic. In: Proc. TestCom/FATES 2007, volume 4581 of LNCS, pp 1–12. Springer Isberner M, Howar F, Steffen B (2014) Learning register automata: from languages to program structures. Mach Learn 96(1-2): 65–98 Isberner M, Howar F, Steffen B (2015) The open-source learnlib—a framework for active automata learning. In: Kroening D, Pasareanu CS (eds) Proc. CAV 2015, volume 9206 of LNCS, pp 487–495. Springer Jhala R, Majumdar R (2009) Software model checking. ACM Comput Surv 41(4): 21, 1–21, 54 Lorenzoli D, Mariani L, Pezzè M (2008) Automatic generation of software behavioral models. In: Proc. ICSE 2008, pp 501–510. ACM Maler O, Mens I-E (2014) Learning regular languages over large alphabets. In: Proc. TACAS 2014, volume 8413 of LNCS, pp 485–499. Springer Rivest RL, Schapire RE (1993) Inference of finite automata using homing sequences. Inf Comput 103(2): 299–347 Shu G, Lee D (2007) Testing security properties of protocol implementations—a machine learning based approach. In: Proc. ICDCS 2007, pp 25. IEEE Utting M, Legeard B (2007) Practical model-based testing—a tools approach. Morgan Kaufmann, Burlington Walkinshaw N, Bogdanov K, Derrick J, Paris J (2010) Increasing functional coverage by inductive testing: a case study. In: Proc. ICTSS 2010, volume 6435 of LNCS, pp 126–141. Springer Xiao H, Sun J, Liu Y, Lin S-W, Sun C (2013) Tzuyu: learning stateful typestates. In: Proc. ASE 2013, pp 432–442. IEEE