Distributed computing with load-managed active storage
Proceedings 11th IEEE International Symposium on High Performance Distributed Computing - Pages 13-23
Abstract
One approach to high-performance processing of massive data sets is to incorporate computation into storage systems. Previous work has shown that this active storage model is effective for a variety of problems. This paper explores opportunities to use active storage as a basis for exploiting asymmetric parallelism in applications using a streaming computation model on collections of fixed-size records. This model is the basis for much of the research in I/O-efficient algorithms, which deals with an important class of massive data problems not studied in previous work on active storage. We present an extension of a streaming computation model for an external memory toolkit to support a flexible mapping of computations to storage-based processors. Our approach enables load-managed active storage: it exposes parallelism, ordering constraints, and primitive computation units to the system, which can configure the application to balance load and make the best use of available processing power. Emulation results from a sorting application demonstrate the potential of dynamic adaptation in load-managed active storage.
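As a rough illustration of the idea in the abstract, the following Python sketch streams fixed-size records, splits them into key-range buckets that can be sorted independently (the exposed parallelism), concatenates the buckets in key order (the ordering constraint), and places each bucket's sort on a processor using a greedy load estimate. It is not the paper's system or the external memory toolkit's API: the record layout, the function names (`read_records`, `partition`, `assign`, `load_managed_sort`), the `speeds` parameter, and the greedy placement rule are all assumptions made for this example.

```python
import bisect
import struct
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical fixed-size record: 4-byte unsigned key + 12-byte payload.
RECORD = struct.Struct("<I12s")
Record = Tuple[int, bytes]


def read_records(buf: bytes) -> Iterator[Record]:
    """Stream fixed-size records out of a byte buffer, one at a time."""
    for off in range(0, len(buf), RECORD.size):
        yield RECORD.unpack_from(buf, off)


def partition(records: Iterable[Record], splitters: List[int]) -> List[List[Record]]:
    """Distribution step: route each record to a key-range bucket."""
    buckets: List[List[Record]] = [[] for _ in range(len(splitters) + 1)]
    for rec in records:
        buckets[bisect.bisect_right(splitters, rec[0])].append(rec)
    return buckets


def assign(bucket_sizes: List[int], speeds: List[float]) -> Dict[int, int]:
    """Greedy placement (an assumed policy): take buckets largest first and
    give each to the processor with the earliest estimated finish time."""
    finish = [0.0] * len(speeds)
    placement: Dict[int, int] = {}
    for b in sorted(range(len(bucket_sizes)), key=lambda i: -bucket_sizes[i]):
        p = min(range(len(speeds)),
                key=lambda j: finish[j] + bucket_sizes[b] / speeds[j])
        finish[p] += bucket_sizes[b] / speeds[p]
        placement[b] = p
    return placement


def load_managed_sort(buf: bytes, splitters: List[int], speeds: List[float]):
    """Sort records bucket by bucket and report where each bucket was placed.
    Processors are only labels here; nothing actually runs remotely."""
    buckets = partition(read_records(buf), splitters)
    placement = assign([len(b) for b in buckets], speeds)
    # Buckets are independent, so each sort could run on its assigned processor.
    sorted_buckets = [sorted(b) for b in buckets]
    # Ordering constraint: buckets must be emitted in key-range order.
    return [rec for b in sorted_buckets for rec in b], placement


if __name__ == "__main__":
    data = b"".join(RECORD.pack(k, b"x") for k in [5, 1, 9, 3, 7, 2])
    # Processor 0 stands for a slow storage CPU, processor 1 for a faster host.
    out, placement = load_managed_sort(data, splitters=[4], speeds=[1.0, 2.0])
    print([k for k, _ in out], placement)
```

In a real system the placement would be revisited as load changes; this sketch computes it once from static speed estimates.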
Keywords
#Distributed computing #Parallel processing #Power system modeling #Computational modeling #Concurrent computing #Computer networks #Large-scale systems #Computer science #Emulation #Sorting