TiQi: answering unstructured natural language trace queries

Springer Science and Business Media LLC - Volume 20 - Pages 215-232 - 2015
Piotr Pruski1, Sugandha Lohar1, William Goss1, Alexander Rasin1, Jane Cleland-Huang1
1DePaul University, Chicago, USA

Abstract

Software traceability is a required element in the development and certification of safety-critical software systems. However, trace links, which are created at significant cost and effort, are often underutilized in practice, primarily because project stakeholders lack the skills needed to formulate complex trace queries. To mitigate this problem, we present TiQi, a solution that transforms spoken or written natural language (NL) queries into Structured Query Language (SQL). TiQi includes a general database query mechanism and a domain-specific model populated with trace query concepts, project-specific terminology, token disambiguators, and query transformation rules. We report results from four experiments exploring user preferences for natural language queries, the accuracy of the generated trace queries, the efficacy of the underlying disambiguators, and the stability of the trace query concepts. The experiments were conducted against two datasets and show that users prefer written NL queries. Queries were transformed at accuracy rates ranging from 47% to 93%.
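
As a rough illustration of the general idea of mapping a natural language trace query onto SQL, the following toy sketch may help. It is not TiQi's implementation: the vocabulary, the table and column names, the trace matrix tables, and the function `to_sql` are all hypothetical, and the keyword lookup stands in for the much richer disambiguation and transformation rules the abstract describes.

```python
import re

# Hypothetical project vocabulary: user terms mapped to artifact tables.
# All table names here are invented for illustration.
ARTIFACT_TABLES = {
    "requirement": "requirements",
    "requirements": "requirements",
    "hazard": "hazards",
    "hazards": "hazards",
    "test": "test_cases",
    "tests": "test_cases",
}

# Hypothetical trace matrix tables that link pairs of artifact tables.
TRACE_LINKS = {
    ("requirements", "hazards"): "tm_requirements_hazards",
    ("requirements", "test_cases"): "tm_requirements_test_cases",
}


def to_sql(query: str) -> str:
    """Translate a very simple NL trace query into SQL (toy example)."""
    tokens = re.findall(r"[a-z]+", query.lower())

    # Keep the artifact tables in the order they are mentioned.
    tables = []
    for token in tokens:
        table = ARTIFACT_TABLES.get(token)
        if table and table not in tables:
            tables.append(table)

    if not tables:
        raise ValueError("no known artifact type found in query")

    source = tables[0]
    if len(tables) == 1:
        return f"SELECT * FROM {source}"

    # Join the first two artifact types through their trace matrix,
    # looking the pair up in either order.
    target = tables[1]
    link = TRACE_LINKS.get((source, target)) or TRACE_LINKS.get((target, source))
    if link is None:
        raise ValueError(f"no trace link known between {source} and {target}")

    return (
        f"SELECT DISTINCT {source}.* FROM {source} "
        f"JOIN {link} ON {link}.{source}_id = {source}.id "
        f"JOIN {target} ON {link}.{target}_id = {target}.id"
    )


if __name__ == "__main__":
    print(to_sql("Show all requirements"))
    print(to_sql("Which requirements trace to hazards?"))
```

Even in this simplified form, the sketch shows why project-specific terminology matters: a query can only be translated if its words can be tied to known artifact types and to the trace links between them.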
