Comprehensive structured knowledge base system construction with natural language presentation
Abstract
Constructing an ontology-based machine-readable knowledge base system from different sources with minimal human intervention, known as ontology-based machine-readable knowledge base construction (OMRKBC), is a long-standing open problem. One of the issues is how to build a large-scale OMRKBC process with appropriate structural information. To address this issue, we propose Natural Language Independent Knowledge Representation (NLIKR), a method that treats each word as a concept defined by its relations with other concepts. Using NLIKR, we propose a framework for the OMRKBC process that automatically develops a comprehensive ontology-based machine-readable knowledge base system (OMRKBS) from well-built structural information. First, as part of this framework, we propose formulas to discover concepts and their relations in the OMRKBS. Second, we develop algorithms and rules to resolve the challenges in obtaining rich structured information. Finally, rich structured information is built into the OMRKBS. The resulting system supports efficient word search and word queries on a specific attribute. We conduct experiments on relational information extraction and analyze the results, which show that OMRKBS achieved an accuracy of 84%, higher than that of the other knowledge base systems compared, namely ConceptNet, DBpedia and WordNet.
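To make the idea of "a concept defined by its relations with other concepts" concrete, the following is a minimal sketch, not the authors' implementation. The class name `ConceptStore`, its methods, and the relation labels `is_a` and `has_color` are hypothetical and chosen only for illustration; the abstract does not specify the actual data model, formulas, or relation vocabulary of OMRKBS.

```python
from collections import defaultdict


class ConceptStore:
    """Toy store: a concept is described only by its relations to other concepts.

    Hypothetical illustration; not the data model used by OMRKBS.
    """

    def __init__(self):
        # concept -> relation name -> set of related concepts
        self._relations = defaultdict(lambda: defaultdict(set))

    def add(self, concept, relation, target):
        """Record that `concept` stands in `relation` to `target`."""
        self._relations[concept][relation].add(target)
        # Touch the target so that every word that appears is itself a concept.
        self._relations[target]

    def describe(self, concept):
        """Return all relations of one concept (empty dict if unknown)."""
        rels = self._relations.get(concept, {})
        return {rel: sorted(targets) for rel, targets in rels.items()}

    def query(self, relation, target):
        """Return all concepts that have `relation` pointing at `target`."""
        return sorted(
            concept
            for concept, rels in self._relations.items()
            if target in rels.get(relation, ())
        )


# Assumed example data; the relation names are illustrative only.
store = ConceptStore()
store.add("apple", "is_a", "fruit")
store.add("apple", "has_color", "red")
store.add("cherry", "is_a", "fruit")
store.add("cherry", "has_color", "red")
store.add("banana", "is_a", "fruit")
store.add("banana", "has_color", "yellow")

# Word search: everything known about one concept.
print(store.describe("apple"))

# Word query with a specific attribute: which concepts are red?
print(store.query("has_color", "red"))
```

Under these assumptions, `describe` corresponds to searching for a word and `query` to asking for words with a specific attribute. A real system would also need the extraction formulas, algorithms, and rules that the abstract mentions but does not detail.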