Towards natural language question generation for the validation of ontologies and mappings

Journal of Biomedical Semantics - Volume 7 - Pages 1-15 - 2016
Asma Ben Abacha1, Julio Cesar Dos Reis2, Yassine Mrabet1, Cédric Pruski1, Marcos Da Silveira1
1Luxembourg Institute of Science and Technology (LIST), Esch-sur-Alzette, Luxembourg
2Institute of Computing, University of Campinas, Campinas, Brazil

Abstract

The increasing number of open-access ontologies and their key role in applications such as decision-support systems highlight the importance of their validation. Human expertise is crucial for validating ontologies from a domain point of view. However, the growing number of ontologies and their fast evolution over time make manual validation challenging. We propose a novel semi-automatic approach based on the generation of natural language (NL) questions to support the validation of ontologies and their evolution. The proposed approach includes the automatic generation, factorization and ordering of NL questions from medical ontologies. The final validation and correction are performed by submitting these questions to domain experts and automatically analyzing their feedback. We also propose a second approach for the validation of mappings impacted by ontology changes. This method exploits the context of the changes to propose correction alternatives presented as Multiple Choice Questions. This research provides a question optimization strategy to maximize the validation of ontology entities with a reduced number of questions. We evaluate our approach on the validation of three medical ontologies. We also evaluate the feasibility and efficiency of our mapping validation approach in the context of ontology evolution. These experiments are performed with different versions of SNOMED-CT and ICD9. The experimental results suggest the feasibility and adequacy of our approach to support the validation of interconnected and evolving ontologies. The results also suggest that taking RDFS and OWL entailment into account helps reduce the number of questions and the validation time. Applying our approach to mapping evolution also shows the difficulty of adapting mappings over time and highlights the importance of semi-automatic validation.
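As an illustration of how entailment and factorization can reduce the number of questions, the following is a minimal Python sketch, not the authors' implementation. It assumes the ontology is given as plain (subject, predicate, object) triples, uses only the transitivity of `subClassOf` as the entailment rule, and assumes an acyclic class hierarchy. The question templates, function names and example triples are hypothetical; question ordering and the analysis of expert feedback are not covered.

```python
from collections import defaultdict

SUBCLASS = "subClassOf"

# Hypothetical templates: one for a single statement, one for a factorized group.
TEMPLATES = {SUBCLASS: "Is {s} a kind of {o}?"}
GROUP_TEMPLATES = {SUBCLASS: "Which of the following are kinds of {o}: {subjects}?"}


def entailed_by_transitivity(triple, triples):
    """Return True if a subClassOf triple follows from a chain of the other triples."""
    s, p, o = triple
    if p != SUBCLASS:
        return False
    # Build the parent map from every subClassOf triple except the one being tested.
    parents = defaultdict(set)
    for other in triples:
        if other[1] == SUBCLASS and other != triple:
            parents[other[0]].add(other[2])
    # Depth-first search from the subject towards the object.
    stack, seen = list(parents[s]), set()
    while stack:
        node = stack.pop()
        if node == o:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return False


def generate_questions(triples):
    """Skip entailed statements, then merge statements sharing predicate and object."""
    to_ask = [t for t in triples if not entailed_by_transitivity(t, triples)]
    groups = defaultdict(list)
    for s, p, o in to_ask:
        groups[(p, o)].append(s)
    questions = []
    for (p, o), subjects in groups.items():
        if len(subjects) == 1:
            questions.append(TEMPLATES[p].format(s=subjects[0], o=o))
        else:
            questions.append(
                GROUP_TEMPLATES[p].format(o=o, subjects=", ".join(subjects))
            )
    return questions


# Hypothetical example data, not taken from SNOMED-CT or ICD9.
EXAMPLE_TRIPLES = [
    ("Viral pneumonia", SUBCLASS, "Pneumonia"),
    ("Pneumonia", SUBCLASS, "Lung disease"),
    ("Viral pneumonia", SUBCLASS, "Lung disease"),
    ("Bacterial pneumonia", SUBCLASS, "Pneumonia"),
]

if __name__ == "__main__":
    for question in generate_questions(EXAMPLE_TRIPLES):
        print(question)
```

In this toy example the statement that viral pneumonia is a lung disease is not asked, because it follows from the two statements that are asked, and the two statements about kinds of pneumonia are merged into a single question.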
