Linked-tree: An aggregate query algorithm based on sliding window over data stream
Tóm tắt
How to process aggregate queries over data streams efficiently and effectively have been becoming hot research topics in both academic community and industrial community. Aiming at the issues, a novel Linked-tree algorithm based on sliding window is proposed in this paper. Due to the proposal of concept area, the Linked-tree algorithm reuses many primary results in last window and then avoids lots of unnecessary repeated comparison operations between two successive windows. As a result, execution efficiency of MAX query is improved dramatically. In addition, since the size of memory is relevant to the number of areas but irrelevant to the size of sliding window, memory is economized greatly. The extensive experimental results show that the performance of Linked-tree algorithm has significant improvement gains over the traditional SC (Simple Compared) algorithm and Ranked-tree algorithm.
Tài liệu tham khảo
Babcock B, Babu S, Datar M,et al. Models and Issues in Data Stream Systems [C]//Proc of the 21st ACM Symposium on Principles of Database Systems. Madison, Wisconsin: ACM Press. 2002:1–16.
Alon N, Matias Y, Szegedy M. The Space Complexity of Approximating the Frequency Moments [C]//Proc of the 28th ACM Symposium on Theory of Computing. Philadelphia, Pennsylvania: ACM Press, 1996:20–29.
Flajolet P, Martin G. Probabilistic Counting [C]//Proc of the 24th IEEE Symposium on Foundations of Computer Science. San Jose: IEEE Computer Society Press, 1983.
Acharya S, Gibbons P B, Poosala V. Congressional Samples for Approximate Answering of Group-by Queries [C]//Proc of the 2000 ACM SIGMOD Intl Conf on Management of Data. Dalas, TX: ACM Press, 2000:487–498.
Acharya S, Gibbons P B, Poosala V,et al. Join Synopses for Approximate Query Answering [C]//Proc of the 1999 ACM SIGMOD Intl Conf on Management of Data. Philadelphia, Pennsylvania: ACM Press, 1999:275–286.
Chaudhuri S, Motwani R, Narasayya V. On Random Sampling over Joins [C]//Proc of the 1999 ACM SIGMOD Intl Conf On Management of Data. Philadelphia, Pennsylvania: ACM Press, 1999:263–274.
Ioannidis Y E, Poosala V. Histogram-Based Approximation of Set-Valued Query-Answers [C]//Proc of the 25th Intl Conf on Very Large Data Bases. Edinburgh, Scotland: Morgan Kaufmann Press, 1999:174–185.
Chakrabarti K K, Garofalakis M N, Rastogi R,et al. Approximate Query Processing Using Wavelets [C]//Proc of the 26th Int'l Conf on Very Large Data Bases. Cairo, Egypt: Morgan Kaufmann Press, 2000:111–122.
Guha S S, Kim Chulyun, Shim Kyuseok. XWAVE: Approximate Extended Wavelets for Streaming Data [C]//Proc of the 30th Intl Conf on Very Large Data Bases. Toronto: Morgan Kaufmann Press, 2004:288–299.
Gehrke J, Korn F, Srivastava D. On Computing Correlated Aggregates over Continual Data Streams [C]//Proc of the 2001 ACM SIGMOD Int Conf on Management of Data. Santa Barbara, California: ACM Press, 2001:13–24.
Chandrasekaran S, Franklin M J. Streaming Queries over Streaming Data [C]//Proc of the 28th Int'l Conf on Very Large Data Bases. Morgan: Kaufmann Press, 2002:203–214.
Datar M, Gionis A, Indyk P,et al. Maintaining Stream Statistics over Sliding Windows [C]//Proc of the 13th ACM-SIAM Symposium on Discrete Algorithms. San Francisco, CA: ACM Press, 2002:635–644.