Predicting sporadic grid data transfers
Proceedings 11th IEEE International Symposium on High Performance Distributed Computing - Pages 188-196
Abstract
The increasingly common practice of replicating datasets and using resources as distributed data stores in grid environments has led to the problem of determining which replica can be accessed most efficiently. Due to the diverse performance characteristics and load variations of the several components in the end-to-end path linking these locations, selecting a replica from among many requires accurate predictions of the data transfer times between sources and sinks. In this paper we present a prediction system that combines end-to-end application throughput observations with network load variations; the former capture whole-system performance and the latter capture variations in load patterns. We develop a set of regression models to derive predictions that characterize the effect of network load variations on file transfer times. We apply these techniques to the GridFTP data movement tool, part of the Globus Toolkit™, and observe gains of up to 10% in prediction accuracy when compared with approaches based on past system behavior in isolation.
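The abstract does not specify the form of the regression models, so the following is only a minimal sketch of the general idea, assuming a simple one-variable linear model: past transfer throughput is regressed on a network load measurement taken at about the same time, and the fitted line is then used to estimate the transfer time of a new file. The function names, the units, and the sample numbers are all hypothetical and are not taken from the paper.

```python
from statistics import mean


def fit_linear(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def predict_transfer_time(file_size_mb, probe_mbps, a, b):
    """Estimate transfer time in seconds from a current network probe value.

    Assumes throughput (MB/s) is approximately a + b * probe_mbps.
    """
    throughput = a + b * probe_mbps
    if throughput <= 0:
        raise ValueError("model predicts non-positive throughput")
    return file_size_mb / throughput


if __name__ == "__main__":
    # Invented illustration data, not measurements from the paper:
    # network probe bandwidth (Mb/s) paired with observed transfer
    # throughput (MB/s) logged at roughly the same time.
    probe_mbps = [10.0, 20.0, 30.0, 40.0]
    transfer_mb_per_s = [2.0, 4.0, 6.0, 8.0]

    a, b = fit_linear(probe_mbps, transfer_mb_per_s)
    eta = predict_transfer_time(100.0, 25.0, a, b)
    print(f"intercept={a:.3f} slope={b:.3f} predicted_time={eta:.1f}s")
```

A replica selector could, under the same assumptions, fit one such model per source and pick the source with the smallest predicted time. When no network probe is available, falling back to the mean of past throughputs corresponds to the history-only baseline the abstract compares against.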
Keywords
#Load management #Predictive models #Computer science #Grid computing #Mathematics #Distributed computing #Joining processes #Throughput #Performance gain #Accuracy