Restricting skyline sizes using weak Pareto dominance

Informatik Forschung und Entwicklung - Tập 21 - Trang 165-178 - 2007
Wolf Tilo Balke1, Ulrich Güntzer2, Wolf Siberski1
1L3S Research Center, Hannover, Germany
2Institut für Informatik, Universität Tübingen, Tübingen, Germany

Tóm tắt

Skyline queries have recently received a lot of attention due to their intuitive query formulation: users can state preferences with respect to several attributes. Unlike numerical or score-based preferences, preferences over discrete value domains do not show an inherent total order, but have to rely on partial orders as stated by the user. In such orders typically many object values are incomparable, increasing the size of skyline sets significantly, and making their computation expensive. In this paper we explore how to enable interactive tasks like query refinement or relevance feedback by providing interesting subsets of the full Pareto skyline, which give users a good overview over the skyline. To be practical these subsets have to be small, efficient to compute, suitable for higher numbers of query predicates, and representative. The key to improved performance and reduced result set sizes is the relaxation of Pareto semantics to the concept of weak Pareto dominance. We argue that this relaxation yields intuitive results and show how it opens up the use of efficient and scalable query processing algorithms. We first derive the complete skyline subset given by weak Pareto dominance called ‘restricted skyline’ and then considering the individual performance of objects limit this further to a subset called ‘focused skyline’. Assessing the practical impact our experiments show that our approach indeed leads to lean result set sizes and outperforms Pareto skyline computations by up to two orders of magnitude.

Tài liệu tham khảo

Balke WT, Güntzer U (2004) Multi-objective Query Processing for Database Systems. In: Proc Int Conf on Very Large Databases (VLDB), Toronto, Canada Balke WT, Güntzer U (2005) Efficient Skyline Queries under Weak Pareto Dominance. In: Proc IJCAI-05 Multidisciplinary Workshop on Advances in Preference Handling (PREFERENCE), Edinburgh, UK Balke WT, Güntzer U, Siberski W (2006) Exploiting Indifference for Customization of Partial Order Skylines. In: Proc of the Int Database Engineering & Applications Symposium (IDEAS 2006), Delhi, India Balke WT, Güntzer U, Siberski W (2007) Getting Prime Cuts from Skylines over Partially Ordered Domains. In: Proc of the GI-Fachtagung für Datenbanksysteme in Business, Technologie und Web (BTW 2007), Aachen, Germany Balke WT, Güntzer U, Zheng J (2004) Efficient Distributed Skylining for Web Information Systems. In: Proc of the Int Conf on Extending Database Technology (EDBT), LNCS 2992, Heraklion, Crete, Greece Balke WT, Wagner M (2004) Through Different Eyes – Assessing Multiple Conceptual Views for Querying Web Services. In: Proc of the Int World Wide Web Conference (WWW 2004), New York, USA, ACM Balke WT, Zheng J, Güntzer U (2005) Approaching the Efficient Frontier: Cooperative Database Retrieval Using High-Dimensional Skylines. In: Proc of the Int Conf on Database Systems for Advanced Applications (DASFAA), Beijing, China Bentley J, Kung H, Schkolnick M, Thompson C (1978) On the Average Number of Maxima in a Set of Vectors and Applications. J ACM 25(4):536–543 Börzsönyi S, Kossmann D, Stocker K (2001) The Skyline Operator. In: Proc of the Int Conf on Data Engineering (ICDE), Heidelberg, Germany Chan C, Eng P, Tan K (2005) Stratified Computation of Skylines with Partially Ordered Domains. In: Proc of the Int Conf on Management of Data (SIGMOD), Baltimore, MD, USA Chomicki J (2002) Querying with Intrinsic Preferences. In: Proc of the Int Conf on Extending Database Technology (EDBT), LNCS 2287, Prague, Czech Republic Chomicki J (2003) Preference Formulas in Relational Queries. ACM Trans Database Syst 28(4):427–466 Fagin R, Lotem A, Naor M (2001) Optimal Aggregation Algorithms for Middleware. In: ACM Symp on Principles of Database Systems (PODS), Santa Barbara, USA Fishburn P (1999) Preference Structures and their Numerical Representations. Theor Comput Sci 217:359–383 Güntzer U, Balke WT, Kießling W (2000) Optimizing Multi-Feature Queries for Image Data-bases. In: Proc of the Int Conf on Very Large Databases (VLDB), Cairo, Egypt Huang X, Jensen C (2004) In-Route Skyline Querying for Location-Based Services. In: Proc of the Int Workshop on Web and Wireless Geographical Information Systems (W2GIS), Goyang, Korea Kießling W (2002) Foundations of Preferences in Database Systems. In: Proc of the Int Conf on Very Large Databases (VLDB), Hong Kong, China Kießling W (2005) Preference Queries with SV-Semantics. In: Proc of the Int Conf on Management of Data (COMAD), Goa, India Köhncke B, Balke WT (2006) Personalized Digital Item Adaptation in Service-Oriented Environments. In: Proc of the Int Workshop on Semantic Media Adaptation and Personalization (SMAP 2006), Athens, Greece Koltun V, Papadimitriou C (2005) Approximately Dominating Representatives. In: Proc of the Int Conf on Database Theory (ICDT), Edinburgh, UK Kossmann D, Ramsak F, Rost S (2002) Shooting Stars in the Sky: An Online Algorithm for Skyline Queries. In: Proc of the Int Conf on Very Large Data Bases (VLDB), Hong Kong, China Lacroix M, Lavency P (1987) Preferences: Putting more Knowledge into Queries. In: Proc of the Int Conf on Very Large Databases (VLDB), Brighton, UK Motro A (1988) VAGUE: A User Interface to Relational Databases that Permits Vague Queries. ACM Trans Inf Syst 6(3):187–214 Papadias D, Tao Y, Fu G et al. (2003) An Optimal and Progressive Algorithm for Skyline Queries. In: Proc of the Int ACM SIGMOD Conf (SIGMOD’03), San Diego, USA Tan KL, Eng PK, Ooi BC (2001) Efficient Progressive Skyline Computation. In: Proc of Conf on Very Large Data Bases (VLDB’01), Rome, Italy