Network performance-aware collective communication for clustered wide-area systems
Abstract
Keywords
References
A. Alexandrov, M.F. Ionescu, K.E. Schauser, C. Scheiman, LogGP: incorporating long messages into the LogP model – one step closer towards a realistic model for parallel computation, in: Proceedings of the Symposium on Parallel Algorithms and Architectures (SPAA), Santa Barbara, CA, July 1995, pp. 95–105
Bal, 1998, Performance evaluation of the Orca shared object system, ACM Trans. Comput. Syst., 16, 1, 10.1145/273011.273014
M. Banikazemi, V. Moorthy, D. Panda, Efficient collective communication on heterogeneous networks of workstations, in: International Conference on Parallel Processing, Minneapolis, MN, August 1998, pp. 460–467
Bernaschi, 1998, Collective communication operations: experimental results vs. theory, Concurrency: Practice and Experience, 10, 359, 10.1002/(SICI)1096-9128(19980425)10:5<359::AID-CPE323>3.0.CO;2-7
Boden, 1995, Myrinet: a gigabit-per-second local area network, IEEE Micro, 15, 29, 10.1109/40.342015
Bruck, 1996, On the design and implementation of broadcast and global combine operations using the postal model, IEEE Trans. Parallel Distrib. Syst., 7, 256, 10.1109/71.491579
D. Culler, R. Karp, D. Patterson, A. Sahay, K.E. Schauser, E. Santos, R. Subramonian, T. von Eicken, LogP: towards a realistic model of parallel computation, in: Proceedings of the Symposium on Principles and Practice of Parallel Programming (PPoPP), San Diego, CA, May 1993, pp. 1–12
G.E. Fagg, K.S. London, J.J. Dongarra, MPI_Connect: managing heterogeneous MPI applications interoperation and process control, in: Proceedings of the 5th European PVM/MPI Users' Group Meeting, number 1497 in LNCS, Liverpool, UK, 1998, pp. 93–96
Foster, 1997, Globus: a metacomputing infrastructure toolkit, Int. J. Supercomput. Appl., 11, 115, 10.1177/109434209701100205
I. Foster, C. Kesselman (Eds.), The GRID: Blueprint for a New Computing Infrastructure, Morgan Kaufmann, Los Altos, CA, 1998
E. Gabriel, M. Resch, T. Beisel, R. Keller, Distributed computing in a heterogeneous computing environment, in: Proceedings of the 5th European PVM/MPI Users' Group Meeting, number 1497 in LNCS, Liverpool, UK, 1998, pp. 180–187
W. George, J. Hagedorn, J. Devaney, Status report on the development of the interoperable MPI protocol, in: Proceedings of MPIDC'99, Message Passing Interface Developer's and User's Conference, Atlanta, GA, March 1999, pp. 7–13
M. Gołębiewski, R. Hempel, J.L. Träff, Algorithms for collective communication operations on SMP clusters, in: The 1999 Workshop on Cluster-Based Computing, held in conjunction with 13th ACM-SIGARCH International Conference on Supercomputing (ICS'99), 1999, pp. 11–15
S. Gorlatch, C. Wedler, C. Lengauer, Optimization rules for programming with collective operations, in: Proceedings of the 13th International Parallel Processing Symposium & 10th Symposium on Parallel and Distributed Processing (IPPS/SPDP'99), 1999, pp. 492–499
J.-P. Goux, S. Kulkarni, J. Linderoth, M. Yoder, An enabling framework for master–worker applications on the computational grid, in: Proceedings of the High Performance Distributed Computing (HPDC 2000), Pittsburgh, PA, August 2000, pp. 43–50
A.S. Grimshaw, W.A. Wulf, and the Legion team, The Legion vision of a worldwide virtual computer, Commun. ACM 40 (1) (1997) 39–45
Gropp, 1996, A high-performance, portable implementation of the MPI message passing interface standard, Parallel Comput., 22, 789, 10.1016/0167-8191(96)00024-5
W.D. Gropp, E. Lusk, D. Swider, Improving the performance of MPI derived datatypes, in: Proceedings of MPIDC'99, Message Passing Interface Developer's and User's Conference, Atlanta, GA, March 1999, pp. 25–30
P. Husbands, J.C. Hoe, MPI-StarT: delivering network performance to numerical applications, in: Proceedings of SC'98, November 1998; Online at http://www.supercomp.org/sc98/proceedings/
G. Iannello, M. Lauria, S. Mercolino, Cross-platform analysis of fast messages for Myrinet, in: Proceedings of the Workshop on CANPC'98, Lecture Notes in Computer Science, vol. 1362, Las Vegas, Nevada, Springer, Berlin, January 1998, pp. 217–231
N.T. Karonis, B.R. de Supinski, I. Foster, W. Gropp, E. Lusk, J. Bresnahan, Exploiting hierarchy in parallel computer networks to optimize collective operation performance, in: Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS 2000), Cancun, Mexico, IEEE, New York, May 2000, pp. 377–384
R.M. Karp, A. Sahay, E.E. Santos, K.E. Schauser, Optimal broadcast and summation in the LogP model, in: Proceedings of the Symposium on Parallel Algorithms and Architectures (SPAA), Velen, Germany, June 1993, pp. 142–153
R. Kesavan, D.K. Panda, Optimal multicast with packetization and network interface support, in: Proceedings of the International Conference on Parallel Processing, IEEE, New York, August 1997, pp. 370–377
T. Kielmann, H.E. Bal, S. Gorlatch, Bandwidth-efficient collective communication for clustered wide area systems, in: Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS 2000), Cancun, Mexico, IEEE, New York, May 2000, pp. 492–499
T. Kielmann, H.E. Bal, K. Verstoep, Fast measurement of LogP parameters for message passing platforms, in: 4th Workshop on Runtime Systems for Parallel Programming (RTSPP), Lecture Notes in Computer Science, vol. 1800, Cancun, Mexico, Springer, Berlin, May 2000, pp. 1176–1183
T. Kielmann, R.F.H. Hofman, H.E. Bal, A. Plaat, R.A.F. Bhoedjang, MPI's reduction operations in clustered wide area systems, in: Proceedings of MPIDC'99, Message Passing Interface Developer's and User's Conference, Atlanta, GA, March 1999, pp. 43–52
T. Kielmann, R.F.H. Hofman, H.E. Bal, A. Plaat, R.A.F. Bhoedjang, MagPIe: MPI's collective communication operations for clustered wide area systems, in: Proceedings of the Symposium on Principles and Practice of Parallel Programming (PPoPP), Atlanta, GA, May 1999, pp. 131–140
B. Lowekamp, A. Beguelin, ECO: efficient collective operations for communication on heterogeneous networks, in: International Parallel Processing Symposium (IPPS), Honolulu, HI, 1996, pp. 399–405
Maillet, 1995, On efficiently implementing global time for performance evaluation on multiprocessor systems, J. Parallel Distrib. Comput., 28, 84, 10.1006/jpdc.1995.1090
J.-Y.L. Park, H.-A. Choi, N. Nupairoj, L.M. Ni, Construction of optimal multicast trees based on the parameterized communication model, in: Proceedings of the International Conference on Parallel Processing (ICPP), vol. I, 1996, pp. 180–187
V. Paxson, On calibrating measurements of packet transit times, in: Proceedings of SIGMETRICS'98/PERFORMANCE'98, Madison, Wisconsin, June 1998, pp. 11–21
Santos, 1999, Optimal and near-optimal algorithms for k-item broadcast, J. Parallel Distrib. Comput., 57, 121, 10.1006/jpdc.1999.1529
van de Geijn, 1994, On global combine operations, J. Parallel Distrib. Comput., 22, 324, 10.1006/jpdc.1994.1091
Watts, 1995, A pipelined broadcast for multidimensional meshes, Parallel Process. Lett., 5, 281, 10.1142/S0129626495000266