A multi-colony ant algorithm for optimizing join queries in distributed database systems
Tóm tắt
Distributed database systems provide a new data processing and storage technology for decentralized organizations of today. Query optimization, the process to generate an optimal execution plan for the posed query, is more challenging in such systems due to the huge search space of alternative plans incurred by distribution. As finding an optimal execution plan is computationally intractable, using stochastic-based algorithms has drawn the attention of most researchers. In this paper, for the first time, a multi-colony ant algorithm is proposed for optimizing join queries in a distributed environment where relations can be replicated but not fragmented. In the proposed algorithm, four types of ants collaborate to create an execution plan. Hence, there are four ant colonies in each iteration. Each type of ant makes an important decision to find the optimal plan. In order to evaluate the quality of the generated plan, two cost models are used—one based on the total time and the other on the response time. The proposed algorithm is compared with two previous genetic-based algorithms on chain, tree and cyclic queries. The experimental results show that the proposed algorithm saves up to about 80 % of optimization time with no significant difference in the quality of generated plans compared with the best existing genetic-based algorithm.
Tài liệu tham khảo
Alamery M, Faraahi A, Javadi H et al (2010) Multi-join query optimization using the bees algorithm. In: De Leon F, De Carvalho A, Rodríguez-González S, De Paz Santana J, Rodríguez J (eds) Distributed computing and artificial intelligence. Springer, Berlin, pp 449–457. doi:10.1007/978-3-642-14883-5_58
Aljanaby A, Abuelrub E, Odeh M (2005) A survey of distributed query processing. Int Arab J Inf Technol 2:48–57
Babb E (1979) Implementing a relational database by means of specialized hardware. ACM Trans Database Syst 4:1–29. doi:10.1145/320064.320065
Bernstein PA, Chiu D-MW (1981a) Using semi-joins to solve relational queries. J. ACM 28:25–40. doi:10.1145/322234.322238
Bernstein PA, Goodman N, Wong E et al (1981b) Query processing in a system for distributed databases (SDD-1). ACM Trans Database Syst 6:602–625. doi:10.1145/319628.319650
Bonabeau E, Dorigo M, Theraulaz G (1999) Swarm intelligence: from natural to artificial systems. Oxford University Press Inc., Oxford
Chen MS, Yu PS (1993) Combining join and semi-join operations for distributed query processing. IEEE Trans Knowl Data Eng 5:534–542
Connolly TM, Begg CE (2005) Database systems: a practical approach to design, implementation and management, 4th edn. Addison-Wesley, USA
Date CJ (2004) An introduction to database systems, 8th edn. Addison-Wesley, USA
Dorigo M, Maniezzo V, Colorni A (1996) Ant system: optimization by a colony of cooperating agents. IEEE Trans Syst Man Cybern Part B Cybern 26:29–41
Dorigo M, Stuzle T (2004) Ant colony optimization. MIT Press, Cambridge
Garcia-Molina H, Ullman JD, Jennifer W (2002) Database systems, the complete book. Prentice Hall, USA
Goli M, Rouhani Rankoohi SMT (2011) A new vertical fragmentation algorithm based on ant collective behavior in distributed database systems. Knowl Inf Syst 30:435–455. doi:10.1007/s10115-011-0384-6
Gorla N, Song S-K (2010) Subquery allocations in distributed databases using genetic algorithms. J Comput Sci Technol 10:31–37
Hameurlain A (2009) Evolution of query optimization methods: from centralized database systems to data grid systems. Proceedings of the 20th international conference on database and expert systems applications. Springer-Verlag, Linz, Austria
Ibaraki T, Kameda T (1984) On the optimal nesting order for computing \(N\)-relational joins. ACM Trans Database Syst 9:482–502. doi: 10.1145/1270.1498
Ioannidis YE (1996) Query optimization. ACM Comput Surv 28:121–123
Karimi Adl R, Rouhani Rankoohi SMT (2009) A new ant colony optimization based algorithm for data allocation problem in distributed databases. Knowl Inf Syst 20:349–373. doi:10.1007/s10115-008-0182-y
Kossmann D (2000) The state of the art in distributed query processing. ACM Comput Surv 32:422–469. doi:10.1145/371578.371598
Lanzelotte RSG, Valduriez P, ZaïT M et al (1994) Industrial-strength parallel query optimization: Issues and lessons. Inf Syst 19:311–330. doi:10.1016/0306-4379(94)90017-5
Li N, Liu Y, Dong Y et al (2008) Application of ant colony optimization algorithm to multi-join query optimization. Proceedings of the 3rd international symposium on advances in computation and intelligence. Springer-Verlag, Wuhan, China
Li Z, Ross KA (1995) PERF join: an alternative to two-way semijoin and bloomjoin. Proceedings of the fourth international conference on information and knowledge management, Baltimore, Maryland, United States
Narendra KS, Thathachar MAL (1974) Learning automata-a survey. IEEE Trans Syst Man Cybern SMC–4:323–334
Ozsu MT, Valduriez P (2011) Principles of distributed database systems, 3rd edn. Springer, USA
Rho S, March ST (1997) Optimizing distributed join queries: A genetic algorithm approach. Ann Oper Res 71:199–228. doi:10.1023/a:1018967414664
Ribeiro CC, Ribeiro CD, Lanzelotte RSG (1997) Query optimization in distributed relational databases. J Heuristics 3:5–23. doi:10.1023/a:1009670031749
Roussopoulos N, Kang H (1991) A pipeline \(N\)-way join algorithm based on the 2-way semijoin program. IEEE Trans Knowl Data Eng 3:486–495
Sevinç E, CosşAr A (2011) An evolutionary genetic algorithm for optimization of distributed database queries. Comput J 54:717–725
Shah-Hosseini H (2009) The intelligent water drops algorithm: a nature inspired swarm-based optimization algorithm. Int J Bio-Inspired Comput 1:71–79
Shahabi C, Khan L, Mcleod D (2000) A probe-based technique to optimize join queries in distributed internet databases. Knowl Inf Syst 2:373–385
St T, Tzle Hoos HH (2000) MAX-MIN ant system. Future Gener Comput Syst 16:889–914
Unnava V (1992) Query processing in distributed database. PhD Thesis, The Ohio state university
Yu CT, Chang CC (1984) Distributed query processing. ACM Comput Surv 16:399–433
Yu CT, Guh KC, Brill D et al (1989) Partition strategy for distributed query processing in fast local networks. IEEE Trans Softw Eng 15:780–793
Yu CT, Keh-Chang G, Weining Z et al (1987) Algorithms to process distributed queries in fast local networks. IEEE Trans Comput C–36:1153–1164
Yu MJ, Sheu PCY (1997) Adaptive join algorithms in dynamic distributed databases. Distrib Parallel Databases 5:5–30