Performance analysis of MPI collective operations

Jelena Pješivac–Grbović1, Thara Angskun2, George Bosilca2, Graham E. Fagg2, Edgar Gabriel3, Jack Dongarra2
1Innovative Computing Laboratory, Computer Science Department, University of Tennessee, Knoxville, USA 37996-3450
2Innovative Computing Laboratory, Computer Science Department, University of Tennessee, Knoxville, USA
3Department of Computer Science, University of Houston, Houston, USA

Tóm tắt

Từ khóa


Tài liệu tham khảo

Rabenseifner, R.: Automatic MPI counter profiling of all users: First results on a CRAY T3E 900-512. In: Proceedings of the Message Passing Interface Developer’s and User’s Conference, 1999, pp. 77–85

Vadhiyar, S.S., Fagg, G.E., Dongarra, J.J.: Automatically tuned collective communications. In: Proceedings of the 2000 ACM/IEEE conference on Supercomputing (CDROM), IEEE Computer Society, 2000, p. 3

Hockney, R.: The communication challenge for MPP: Intel Paragon and Meiko CS-2. Parallel Comput. 20(3), 389–398 (1994)

Culler, D., Karp, R., Patterson, D., Sahay, A., Schauser, K.E., Santos, E., Subramonian, R., von Eicken, T.: LogP: Towards a realistic model of parallel computation. In: Proceedings of the fourth ACM SIGPLAN symposium on Principles and practice of parallel programming, pp. 1–12. ACM Press, New York (1993)

Alexandrov, A., Ionescu, M.F., Schauser, K.E., Scheiman, C.: LogGP: Incorporating long messages into the LogP model. In: Proceedings of the seventh annual ACM symposium on Parallel algorithms and architectures, pp. 95–105. ACM Press, New York (1995)

Kielmann, T., Bal, H., Verstoep, K.: Fast measurement of LogP parameters for message passing platforms. In: Rolim, J.D.P. (ed.) IPDPS Workshops, Cancun, Mexico. Lecture Notes in Computer Science, vol. 1800, pp. 1176–1183. Springer-Verlag, London (2000)

Culler, D., Liu, L.T., Martin, R.P., Yoshikawa, C.: Assessing fast network interfaces. IEEE Micro 16, 35–43 (1996)

Fagg, G.E., Gabriel, E., Chen, Z., Angskun, T., Bosilca, G., Bukovsky, A., Dongarra, J.J.: Fault tolerant communication library and applications for high performance computing. In: LACSI Symposium, 2003

Grama, A., Gupta, A., Karypis, G., Kumar, V.: Introduction to Parallel Computing, second edn. Pearson Education Limited, Addison-Wesley Logman, Boston (2003)

Thakur, R., Gropp, W.: Improving the performance of collective operations in MPICH. In: Dongarra, J., Laforenza, D., Orlando, S. (eds.) Recent Advances in Parallel Virtual Machine and Message Passing Interface. LNCS, vol. 2840, pp. 257–267. Springer Verlag, ??? (2003), 10th European PVM/MPI User’s Group Meeting, Venice, Italy

Chan, E.W., Heimlich, M.F., Purkayastha, A., van de Geijn, R.M.: On optimizing of collective communication. In: Cluster. (2004)

Rabenseifner, R., Träff, J.L.: More efficient reduction algorithms for non-power-of-two number of processors in message-passing parallel systems. In: Proceedings of EuroPVM/MPI. Lecture Notes in Computer Science. Springer-Verlag, Berlin (2004)

Kielmann, T., Hofman, R.F.H., Bal, H.E., Plaat, A., Bhoedjang, R.A.F.: MagPIe: MPI’s collective communication operations for clustered wide area systems. In: Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming, pp. 131–140. ACM, New York (1999)

Barchet-Estefanel, L.A., Mounié, G.: Fast tuning of intra-cluster collective communications. In: Proceedings, 11th European PVM/MPI Users’ Group Meeting, Budapest, Hungary, 2004, pp. 28–35

Bell, C., Bonachea, D., Cote, Y., Duell, J., Hargrove, P., Husbands, P., Iancu, C., Welcome, M., Yelick, K.: An evaluation of current high-performance networks. In: Proceedings of the 17th International Symposium on Parallel and Distributed Processing, p. 28.1. IEEE Computer Society, Washington (2003)

Bernaschi, M., Iannello, G., Lauria, M.: Efficient implementation of reduce-scatter in MPI. J. Syst. Archit. 49(3), 89–108 (2003)

Bruck, J., Ho, C.T., Kipnis, S., Upfal, E., Weathersby, D.: Efficient algorithms for all-to-all communications in multiport message-passing systems. IEEE Trans. Parallel Distributed Syst. 8(11), 1143–1156 (1997)

Kielmann, T., Bal, H.E., Gorlatch, S., Verstoep, K., Hofman, R.F.: Network performance-aware collective communication for clustered wide-area systems. Parallel Comput. 27(11), 1431–1456 (2001)

Gropp, W., Lusk, E., Doss, N., Skjellum, A.: A high-performance, portable implementation of the MPI message passing interface standard. Parallel Comput. 22(6), 789–828 (1996)

Gropp, W., Lusk, E.L.: Reproducible measurements of MPI performance characteristics. In: Proceedings of the 6th European PVM/MPI Users’ Group Meeting on Recent Advances in PVM and MPI, pp. 11–18. Springer-Verlag, London (1999)

Gabriel, E., Fagg, G.E., Bosilca, G., Angskun, T., Dongarra, J.J., Squyres, J.M., Sahay, V., Kambadur, P., Barrett, B., Lumsdaine, A., Castain, R.H., Daniel, D.J., Graham, R.L., Woodall, T.S.: Open MPI: Goals, concept, and design of a next generation MPI implementation. In: Proceedings, 11th European PVM/MPI Users’ Group Meeting, Budapest, Hungary, 2004, pp. 97–104