Generating metadata from web documents: a systematic approach
Abstract
This paper proposes a mechanism that generates an RDF Semantic Web schema from a set of Web documents to serve as their semantic metadata. By analyzing both the structured and unstructured content of Web documents, semi-structured documents can be conceptualized as resource objects with inter-relationships in an RDF diagram. Technically, the hyperlinks, basic annotations, and keywords in Web documents are analyzed, and the corresponding RDF schema is generated following the mechanism and rules proposed in this paper. With the semantic metadata of document sets on the Web translated systematically rather than edited manually, semantic operations on the Web, such as semantic query and semantic search, are expected to become possible in the future.
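As a rough illustration of the kind of translation described above, the following minimal sketch reads one HTML page, collects its title, meta keywords, and hyperlinks, and emits them as RDF statements in N-Triples form. It is not the paper's mechanism or rule set, which the abstract does not spell out. The class `PageFacts`, the functions `literal` and `page_to_ntriples`, and the choice of Dublin Core predicates (`title`, `subject`, `references`) are assumptions made only for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Vocabulary choices are assumptions for illustration, not taken from the paper.
DC = "http://purl.org/dc/elements/1.1/"
DCT = "http://purl.org/dc/terms/"


class PageFacts(HTMLParser):
    """Collects the title, meta keywords, and hyperlinks of one HTML page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title = ""
        self.keywords = []
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "a" and attrs.get("href"):
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, attrs["href"]))
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.keywords += [k.strip() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def literal(text):
    """Quote a string as an N-Triples literal (escapes backslash, quote, newline)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return '"' + escaped + '"'


def page_to_ntriples(url, html):
    """Return N-Triples lines describing one page as an RDF resource."""
    facts = PageFacts(url)
    facts.feed(html)
    facts.close()
    lines = []
    title = facts.title.strip()
    if title:
        lines.append(f"<{url}> <{DC}title> {literal(title)} .")
    for keyword in facts.keywords:
        lines.append(f"<{url}> <{DC}subject> {literal(keyword)} .")
    for target in facts.links:
        lines.append(f"<{url}> <{DCT}references> <{target}> .")
    return lines
```

Calling `page_to_ntriples("http://example.org/", html_text)` returns one statement per extracted fact, with the page URL as the subject. Each hyperlink becomes a statement whose object is another resource, which is how pages in a document set end up connected in a graph. A fuller treatment would also need the annotation analysis and schema-generation rules that the paper itself defines.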