Access Support Tree and TextArray: a data structure for XML document storage and retrieval
Proceedings 14th International Conference on Scientific and Statistical Database Management - Pages 155-164
Abstract
The characteristics of XML documents require new ways of storing and querying such documents. Queries on both textual content and structural aspects must be supported efficiently. For this reason, we examined existing work on both document storage approaches and models for querying documents to derive requirements that are essential for the storage of XML documents. As a result of our study, we designed the Access Support Tree and TextArray (AST/TA) data structure. The key idea of the AST/TA data structure is the separation of the (logical) structure of a document from its "visible" text content. The latter is represented as a single contiguous string. At the same time, the AST/TA data structure integrates the two tightly so that changes remain consistent. We introduce the AST/TA data structure formally through its abstraction, the AST/TA model, and compare the requirements of our AST/TA approach with those found in the current literature. Finally, we describe the advantages of the AST/TA model based on the AST/TA design principles.
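The separation described in the abstract can be illustrated with a minimal Python sketch. It is not the formal AST/TA model from the paper: the names `Node`, `Document`, `innermost`, and `insert_text`, the half-open offset spans, and the boundary rule for insertions are all assumptions made for illustration. The sketch keeps the visible text as one contiguous string, keeps the structure as a tree of spans into that string, and updates both together on a change.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical structure node: element name plus a [start, end) span into the text."""
    name: str
    start: int
    end: int
    children: List["Node"] = field(default_factory=list)


class Document:
    """Illustrative only: a structure tree over one contiguous text string."""

    def __init__(self, text: str, root: Node):
        self.text = text
        self.root = root

    def content(self, node: Node) -> str:
        """Return the visible text covered by a structure node."""
        return self.text[node.start:node.end]

    def innermost(self, pos: int) -> Node:
        """Map a text offset back to the deepest node whose span covers it."""
        node = self.root
        while True:
            for child in node.children:
                if child.start <= pos < child.end:
                    node = child
                    break
            else:
                return node

    def insert_text(self, pos: int, s: str) -> None:
        """Insert s at pos and adjust every span so tree and text stay consistent."""
        self.text = self.text[:pos] + s + self.text[pos:]
        self._shift(self.root, pos, len(s))
        # Assumption: the root always covers the whole text.
        self.root.start, self.root.end = 0, len(self.text)

    def _shift(self, node: Node, pos: int, delta: int) -> None:
        # Assumed boundary rule: text inserted exactly where a node ends is
        # attributed to that node; a node starting at pos is moved right.
        if node.start >= pos:
            node.start += delta
        if node.end >= pos:
            node.end += delta
        for child in node.children:
            self._shift(child, pos, delta)


# Example: <doc><title>AST/TA</title><p>structure and text</p></doc>
title_text, para_text = "AST/TA", "structure and text"
text = title_text + para_text
title = Node("title", 0, len(title_text))
para = Node("p", len(title_text), len(text))
doc = Document(text, Node("doc", 0, len(text), [title, para]))

# A content query runs on the plain string; the tree supplies the structural context.
hit = doc.text.find("text")
print(doc.innermost(hit).name)

# A change to the text is applied to the string and the spans in one step.
doc.insert_text(hit, "visible ")
print(doc.content(title))
print(doc.content(para))
```

Searching the contiguous string needs no tree traversal, while the spans let a hit be mapped back to its enclosing element. The cost in this naive version is that an insertion touches every node; the sketch makes no claim about how the actual AST/TA structure handles updates.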
Keywords
#Tree data structures #XML #Information retrieval #Data structures #Content based retrieval #Database languages #Computer science #Internet #Search engines #Merging