Decoupling computation and data scheduling in distributed data-intensive applications
Proceedings 11th IEEE International Symposium on High Performance Distributed Computing - Pages 352-358
Abstract
In high-energy physics, bioinformatics, and other disciplines, we encounter applications involving numerous, loosely coupled jobs that both access and generate large data sets. So-called Data Grids seek to harness geographically distributed resources for such large-scale data-intensive problems. Yet effective scheduling in such environments is challenging, due to a need to address a variety of metrics and constraints while dealing with multiple, potentially independent sources of jobs and a large number of storage, compute, and network resources. We describe a scheduling framework that addresses these problems. Within this framework, data movement operations may be either tightly bound to job scheduling decisions or, alternatively, performed by a decoupled, asynchronous process on the basis of observed data access patterns and load. We develop a family of algorithms and use simulation studies to evaluate various combinations. Our results suggest that while it is necessary to consider the impact of replication, it is not always necessary to couple data movement and computation scheduling. Instead, these two activities can be addressed separately, thus significantly simplifying the design and implementation.
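The decoupled arrangement described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Site` class, the `schedule_job` and `replicate_popular` functions, the "least-loaded site" rule, and the popularity `threshold` are all hypothetical. The only idea it tries to show is that the job scheduler reads the current replica locations without moving data, while a separate process replicates data based on observed access counts and load.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical site with local storage and a job queue."""
    name: str
    datasets: set = field(default_factory=set)  # replicas stored at this site
    queue: list = field(default_factory=list)   # jobs waiting at this site

    @property
    def load(self) -> int:
        return len(self.queue)


def schedule_job(job_id, dataset, sites, access_log):
    """Job scheduler: choose a site using only the current replica locations.

    It never triggers a transfer itself. It prefers the least-loaded site
    that already holds the dataset and otherwise falls back to the
    least-loaded site overall (a remote read is assumed in that case).
    """
    holders = [s for s in sites if dataset in s.datasets]
    target = min(holders or sites, key=lambda s: s.load)
    target.queue.append(job_id)
    access_log[dataset] += 1  # record the access for the data scheduler
    return target


def replicate_popular(sites, access_log, threshold):
    """Data scheduler: runs independently of job placement, e.g. on a timer.

    Any dataset accessed at least `threshold` times since the last pass is
    copied to the least-loaded site that lacks it. Returns the copies made.
    """
    copies = []
    for dataset, hits in access_log.items():
        if hits < threshold:
            continue
        candidates = [s for s in sites if dataset not in s.datasets]
        if candidates:
            target = min(candidates, key=lambda s: s.load)
            target.datasets.add(dataset)
            copies.append((dataset, target.name))
    access_log.clear()  # start a fresh observation window
    return copies
```

In this sketch the two functions share only the replica sets and the access log (a `Counter`), so either policy could be swapped out without touching the other, which is the kind of simplification the abstract attributes to decoupling.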
Keywords
#Distributed computing #Processor scheduling #Scheduling algorithm #Application software #Computer science #Large-scale systems #Resource management #Physics computing #Laboratories #Bioinformatics