Mapping Large Spatial Flow Data with Hierarchical Clustering
Abstract
Mapping large spatial flow data is challenging because of occlusion and cluttered display: hundreds of thousands of flows overlap and intersect each other. Existing flow mapping approaches often aggregate flows using predetermined high‐level geographic units (e.g. states) or bundle partial flow lines that are close in space. Both approaches cause a significant loss or distortion of information and may miss major patterns. In this research, we developed a flow clustering method that extracts clusters of similar flows in order to avoid clutter, reveal abstracted flow patterns, and preserve data resolution as much as possible. Specifically, our method extends the traditional hierarchical clustering method to aggregate and map large flow data. The new method considers both origins and destinations in determining the similarity of two flows, which ensures that a flow cluster represents flows from similar origins to similar destinations and thus minimizes information loss during aggregation. With a spatial index and search algorithm, the new method is scalable to large flow data sets. As a hierarchical method, it generalizes flows to different hierarchical levels and has the potential to support multi‐resolution flow mapping. Different distance definitions can be incorporated to adapt to the uneven spatial distribution of flows and to detect flow clusters of different densities. To assess the quality and fidelity of the flow clusters and flow maps, we carried out a case study analyzing a data set of 243,850 taxi trips within an urban area.
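To make the idea of clustering flows by both endpoints concrete, the following is a minimal sketch, not the method described in the abstract. It assumes a flow is a tuple of planar coordinates, that the dissimilarity of two flows is the larger of their origin gap and destination gap, and that clusters are merged by naive complete linkage until no pair is within a threshold. The names `flow_distance`, `cluster_flows`, and `threshold` are hypothetical, and the brute-force search here does not use a spatial index, so it would not scale to hundreds of thousands of flows.

```python
import math
from itertools import combinations

# A flow is (origin_x, origin_y, dest_x, dest_y); this layout is an assumption.


def flow_distance(a, b):
    """Assumed dissimilarity: the larger of the origin gap and the destination gap."""
    d_origin = math.hypot(a[0] - b[0], a[1] - b[1])
    d_dest = math.hypot(a[2] - b[2], a[3] - b[3])
    return max(d_origin, d_dest)


def cluster_flows(flows, threshold):
    """Naive complete-linkage agglomeration over flow indices.

    Repeatedly merges the closest pair of clusters and stops once the
    closest pair is farther apart than `threshold`.
    """
    clusters = [[i] for i in range(len(flows))]

    def linkage(c1, c2):
        # Complete linkage: distance between the two most dissimilar members.
        return max(flow_distance(flows[i], flows[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        p, q = min(
            combinations(range(len(clusters)), 2),
            key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]),
        )
        if linkage(clusters[p], clusters[q]) > threshold:
            break
        # p < q is guaranteed by combinations, so deleting q leaves p valid.
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Hypothetical toy data: three flows heading north-east, two heading south-west.
flows = [
    (0.0, 0.0, 10.0, 10.0),
    (0.5, 0.0, 10.0, 10.5),
    (0.0, 0.4, 9.8, 10.0),
    (5.0, 5.0, 0.0, 0.0),
    (5.2, 5.0, 0.1, 0.0),
]
clusters = cluster_flows(flows, threshold=1.0)
print(clusters)
```

Because the distance takes both endpoints into account, two flows that share an origin but end far apart are never merged at a small threshold. Recording the order of merges instead of stopping at one threshold would give the multi-level hierarchy that the abstract refers to.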