Querying documents in object databases

Springer Science and Business Media LLC - Volume 1 - Pages 5-19 - 1997
Serge Abiteboul (1), Sophie Cluet (1), Vassilis Christophides (1), Tova Milo (2), Guido Moerkotte (3), Jérôme Siméon (1)
(1) INRIA-Rocquencourt, BP 105, F-78153 Le Chesnay Cedex, France
(2) Tel Aviv University, Ramat Aviv, Tel Aviv 69978, Israel
(3) Lehrstuhl für Praktische Informatik III, Seminargebäude A5, Universität Mannheim, D-68131 Mannheim, Germany

Abstract

[...] that consist of grammars annotated with database programs. To query documents, we introduce an extension of OQL, the ODMG standard query language for object databases. Our extension, named OQL-doc, allows us to query documents without precise knowledge of their structure, using in particular generalized path expressions and pattern matching. This lets us introduce navigational and information-retrieval styles of data access into a declarative language in the style of SQL or OQL. Query processing in the context of documents and path expressions raises challenging implementation issues. We extend an object algebra with new operators to deal with generalized path expressions. We then consider two essential, complementary optimization techniques. First, we show that almost standard database optimization techniques can be used to answer queries without loading the entire document into the database. Second, we consider the interaction of full-text indexes (e.g., inverted files) with standard database collection indexes (e.g., B-trees) that provide important speed-ups.
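
As an informal illustration of what querying "without precise knowledge of the structure" can mean, the following Python sketch searches a nested document for every value stored under a given attribute name, at any depth, and filters the results by a text pattern. It is not OQL-doc syntax and does not reproduce the algebra operators of the paper; the functions `walk` and `select` and the sample `article` data are hypothetical names introduced only for this example.

```python
import re
from typing import Any, Iterator, List, Tuple

Path = Tuple[Any, ...]


def walk(node: Any, path: Path = ()) -> Iterator[Tuple[Path, Any]]:
    """Yield (path, value) for every node reachable from `node`."""
    yield path, node
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from walk(child, path + (index,))


def select(doc: Any, attribute: str, pattern: str) -> List[Tuple[Path, str]]:
    """Return every string stored under `attribute`, at any depth,
    whose text matches `pattern` (case-insensitive)."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [
        (path, value)
        for path, value in walk(doc)
        if path
        and path[-1] == attribute
        and isinstance(value, str)
        and regex.search(value)
    ]


# Hypothetical document: titles occur at three different nesting depths.
article = {
    "title": "Querying documents",
    "sections": [
        {"title": "Introduction", "body": "Documents are stored in a database."},
        {
            "title": "Query optimization",
            "subsections": [{"title": "Path queries"}],
        },
    ],
}

# The caller names only the attribute and the pattern, not the path to it.
matches = select(article, "title", r"quer")
for path, value in matches:
    print(path, value)
```

The sketch loads and traverses the whole document in memory, which is precisely the cost that the optimization techniques mentioned in the abstract aim to avoid.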