Efficient techniques for range search queries on earth science data

Qingmin Shi¹, J.F. JaJa¹
¹Institute for Advanced Computer Studies, Department of Electrical and Computer Engineering, University of Maryland, College Park, MD, USA

Abstract

We consider the problem of organizing large-scale earth science raster data to efficiently handle queries that identify regions whose parameters fall within ranges of values specified by the queries. This problem seems to be critical to enabling basic data mining tasks such as determining associations between physical phenomena and spatial factors, detecting changes and trends, and content-based retrieval. We assume that the input is too large to fit in internal memory and hence focus on data structures and algorithms that minimize the I/O bounds. A new data structure, called a tree-of-regions (ToR), is introduced; it combines an R-tree with an efficient representation of regions. It is shown that this data structure handles range queries in optimal I/O time under certain reasonable assumptions. We also show that updates to the ToR can be handled efficiently. Experimental results for a variety of multi-valued earth science data illustrate the fast execution times of a wide range of queries, as predicted by our theoretical analysis.
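
To make the query type concrete, the sketch below shows a much simpler stand-in for the idea of pruning by value ranges. It is not the authors' ToR: it assumes fixed square tiles kept in a flat list rather than an R-tree, it runs entirely in memory, and the names build_tiles and range_query are hypothetical. Each tile stores a per-parameter (min, max) box, and a query scans only the tiles whose box intersects the requested ranges.

```python
from typing import List, Sequence, Tuple

Cell = Tuple[float, ...]               # one value per parameter
Range = Sequence[Tuple[float, float]]  # (low, high) per parameter


def build_tiles(raster: List[List[Cell]], tile: int):
    """Split the raster into tile x tile blocks (an assumed, fixed tiling)
    and record each block's per-parameter (min, max) box in value space."""
    tiles = []
    rows, cols = len(raster), len(raster[0])
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            coords = [(r, c)
                      for r in range(r0, min(r0 + tile, rows))
                      for c in range(c0, min(c0 + tile, cols))]
            values = [raster[r][c] for r, c in coords]
            # zip(*values) groups the values of each parameter across cells.
            box = [(min(v), max(v)) for v in zip(*values)]
            tiles.append((box, coords))
    return tiles


def range_query(raster: List[List[Cell]], tiles, query: Range):
    """Return the (row, col) cells whose every parameter lies in the query range."""
    hits = []
    for box, coords in tiles:
        # Prune: skip tiles whose value box misses the query in any parameter.
        if any(hi < qlo or lo > qhi
               for (lo, hi), (qlo, qhi) in zip(box, query)):
            continue
        # Scan only the cells of tiles that survived pruning.
        for r, c in coords:
            if all(qlo <= v <= qhi
                   for v, (qlo, qhi) in zip(raster[r][c], query)):
                hits.append((r, c))
    return hits


# Usage: a 4 x 4 raster with two made-up parameters per cell.
raster = [[(r + c, r * c) for c in range(4)] for r in range(4)]
tiles = build_tiles(raster, 2)
hits = range_query(raster, tiles, [(2, 3), (1, 2)])
```

In this toy setting the flat list of tiles is scanned linearly; a hierarchical index over the value boxes, as the abstract describes for the ToR, would be needed to avoid touching every tile on external-memory data.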

Keywords

#Geoscience #Large-scale systems #Data mining #Information retrieval #Content based retrieval #Data structures #Computational modeling #Educational institutions #Organizing #Tree data structures
