A methodology to learn ontological attributes from the Web

Data and Knowledge Engineering, Volume 69, Pages 573-597, 2010

David Sánchez¹

¹Intelligent Technologies for Advanced Knowledge Acquisition (ITAKA), Departament d’Enginyeria Informàtica i Matemàtiques, Universitat Rovira i Virgili, Avda. Països Catalans, 26. 43007 Tarragona, Spain

