Normalization and optimization of schema mappings

The VLDB Journal - Volume 20 - Pages 277-302 - 2011
Georg Gottlob¹, Reinhard Pichler², Vadim Savenkov²
¹ Computing Laboratory, Oxford University, Oxford, United Kingdom
² Database and Artificial Intelligence Group, Institute of Information Systems, Vienna University of Technology, Vienna, Austria

Abstract

Schema mappings are high-level specifications that describe the relationship between database schemas. They are an important tool in several areas of database research, notably in data integration and data exchange. However, a concrete theory of schema mapping optimization including the formulation of optimality criteria and the construction of algorithms for computing optimal schema mappings is completely lacking to date. The goal of this work is to fill this gap. We start by presenting a system of rewrite rules to minimize sets of source-to-target tuple-generating dependencies. Moreover, we show that the result of this minimization is unique up to variable renaming. Hence, our optimization also yields a schema mapping normalization. By appropriately extending our rewrite rule system, we also provide a normalization of schema mappings containing equality-generating target dependencies. An important application of such a normalization is in the area of defining the semantics of query answering in data exchange, since several definitions in this area depend on the concrete syntactic representation of the mappings. This is, in particular, the case for queries with negated atoms and for aggregate queries. The normalization of schema mappings allows us to eliminate the effect of the concrete syntactic representation of the mapping from the semantics of query answering. We discuss in detail how our results can be fruitfully applied to aggregate queries.
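
To make the idea of minimizing a source-to-target tuple-generating dependency concrete, the sketch below shows one simple kind of simplification: dropping conclusion atoms that become redundant once existential variables are folded onto other terms. For example, P(x) → ∃y ∃z (R(x,y) ∧ R(x,z)) is logically equivalent to P(x) → ∃y R(x,y). This is only an illustration under assumed conventions, not the rewrite rule system of the paper; the function name `simplify_conclusion` and the tuple encoding of atoms are hypothetical, and the brute-force search is exponential in the number of existential variables.

```python
from itertools import product


def simplify_conclusion(atoms, existential):
    """Drop redundant conclusion atoms by folding existential variables.

    atoms:       set of (relation_name, (term, ...)) tuples (hypothetical encoding)
    existential: set of variable names that are existentially quantified
    Returns an equivalent set of atoms with no foldable redundancy left.
    """
    atoms = set(atoms)
    changed = True
    while changed:
        changed = False
        # Only existential variables that still occur may be remapped.
        ex_vars = sorted(v for v in existential
                         if any(v in args for _, args in atoms))
        terms = sorted({t for _, args in atoms for t in args})
        # Try every assignment of existential variables to existing terms.
        for image in product(terms, repeat=len(ex_vars)):
            h = dict(zip(ex_vars, image))
            mapped = {(rel, tuple(h.get(t, t) for t in args))
                      for rel, args in atoms}
            # A mapping into a proper subset proves the other atoms redundant.
            if mapped < atoms:
                atoms = mapped
                changed = True
                break
    return atoms


# Conclusion of  P(x) -> exists y, z . R(x, y) AND R(x, z)
conclusion = {("R", ("x", "y")), ("R", ("x", "z"))}
minimized = simplify_conclusion(conclusion, existential={"y", "z"})
print(len(minimized))
```

Each pass strictly shrinks the atom set, so the loop terminates. Universally quantified variables such as `x` are never remapped, which keeps the simplified conclusion equivalent to the original one.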
