High Performance Interconnect Network for Tianhe System
Abstract
Keywords
References
Liao X, Xiao L, Yang C et al. Milkyway-2 supercomputer system and application. Frontiers of Computer Science, 2014, 8(3): 345–356.
Pritchard H, Gorodetsky I, Buntinas D. A uGNI-based MPICH2 Nemesis network module for the Cray XE. In Proc. the 18th European MPI Users' Group Conference on Recent Advances in the Message Passing Interface, Sept. 2011, pp.110-119.
Xie M, Lu Y, Liu L et al. Implementation and evaluation of network interface and message passing services for TianHe-1A supercomputer. In Proc. the 19th IEEE Annual Symposium on High Performance Interconnects, Aug. 2011, pp.78-86.
Kim J, Dally W J, Towles B, Gupta A K. Microarchitecture of a high radix router. In Proc. the 32nd Annual International Symposium on Computer Architecture, June 2005, pp.420-431.
Schoinas I, Hill M D. Address translation mechanisms in network interfaces. In Proc. the 4th International Symposium on High-Performance Computer Architecture, Feb. 1998, pp.219-230.
Chun B N, Mainwaring A, Culler D E. Virtual network transport protocols for Myrinet. IEEE Micro, 1998, 18(1): 53–63.
Araki S, Bilas A, Dubnicki C et al. User-space communication: A quantitative study. In Proc. ACM/IEEE Conference on Supercomputing, Nov. 1998.
Bhoedjang R A F, Ruhl T, Bal H E. User-level network interface protocols. Computer, 1998, 31(11): 53–60.
Graham R L, Poole S, Shamis P et al. Overlapping computation and communication: Barrier algorithms and ConnectX-2 CORE-Direct capabilities. In Proc. IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum, April 2010.
Kandalla K, Subramoni H, Vienne J et al. Designing nonblocking broadcast with collective offload on InfiniBand clusters: A case study with HPL. In Proc. the 19th IEEE Annual Symposium on High Performance Interconnects, Aug. 2011, pp.27-34.
Buntinas D, Goglin B, Goodell D et al. Cache-efficient, intranode, large-message MPI communication with MPICH2-Nemesis. In Proc. International Conference on Parallel Processing, Sept. 2009, pp.462-469.
Lauria M, Pakin S, Chien A. Efficient layering for high speed communication: Fast messages 2.x. In Proc. the 7th International Symposium on High Performance Distributed Computing, July 1998, pp.10-20.
Liu J, Panda D K. Implementing efficient and scalable flow control schemes in MPI over InfiniBand. In Proc. the 18th International Parallel and Distributed Processing Symposium, April 2004.
Vetter J S, Mueller F. Communication characteristics of large-scale scientific applications for contemporary cluster architectures. Journal of Parallel and Distributed Computing, 2003, 63(9): 853–865.
Tezuka H, O’Carroll F, Hori A et al. Pin-down cache: A virtual memory management technique for zero-copy communication. In Proc. Symposium on Parallel and Distributed Processing, Mar. 30-Apr. 3, 1998, pp.308-314.
Chen D, Eisley N A, Heidelberger P et al. The IBM Blue Gene/Q interconnection fabric. IEEE Micro, 2012, 32(1): 32–43.
Alverson R, Roweth D, Kaplan L. The Gemini system interconnect. In Proc. the 18th IEEE Symposium on High Performance Interconnects, Aug. 2010, pp.83-87.
Schroeder B, Gibson G. Understanding failures in petascale computers. J. Physics: Conference Series, 2007, 78: 012022.
Graham R L, Poole S, Shamis P et al. ConnectX-2 InfiniBand management queues: First investigation of the new support for network offloaded collective operations. In Proc. the 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, May 2010, pp.53-62.
Subramoni H, Kandalla K, Sur S et al. Design and evaluation of generalized collective communication primitives with overlap using ConnectX-2 offload engine. In Proc. the 18th IEEE Annual Symposium on High Performance Interconnects, Aug. 2010, pp.40-49.