Revisiting locality-awareness in view of dynamically changing topologies
References
Message Passing Interface Forum, MPI: a message-passing interface standard, version 3.1, 2015.
Karonis, 2000, Exploiting hierarchy in parallel computer networks to optimize collective operation performance, 377
Träff, 2003, Improved MPI all-to-all communication on a giganet SMP cluster, 2474, 392
Träff, 2003, SMP-aware message passing programming, 56
Mirsadeghi, 2016, Topology-aware rank reordering for MPI collectives, 1759
Ma, 2011, Process distance-aware adaptive MPI collective communications, 196
Wang, 2012, Proactive process-level live migration and back migration in HPC environments, J. Parallel Distrib. Comput., 72, 254, 10.1016/j.jpdc.2011.10.009
Reghenzani, 2016, The MIG framework: Enabling transparent process migration in Open MPI, 64
Zhang, 2017, High-performance virtual machine migration framework for MPI applications on SR-IOV enabled InfiniBand clusters, 143
S. Pickartz, J. Breitbart, C. Clauss, S. Lankes, A. Monti, Co-Scheduling of HPC Applications, IOS Press, pp. 114–141.
Breitbart, 2017, Dynamic co-scheduling driven by main memory bandwidth utilization, 400
Zhang, 2016, Slurm-V: extending Slurm for building efficient HPC cloud with SR-IOV and IVShmem, 349
de Alfonso, 2017, Container-based virtual elastic clusters, J. Syst. Softw., 127, 1, 10.1016/j.jss.2017.01.007
Ruivo, 2014, Exploring InfiniBand hardware virtualization in OpenNebula towards efficient high-performance computing, 943
Pickartz, 2017, A locality-aware communication layer for virtualized clusters, 605
Pickartz, 2017, Enabling hierarchy-aware MPI collectives in dynamically changing topologies, 2:1
Pickartz, 2018, Prospects and challenges of virtual machine migration in HPC, Concurr. Comput.: Pract. Exp., 10.1002/cpe.4412
Pickartz, 2016, Application migration in HPC—a driver of the exascale era?, 318
Pickartz, 2016, Non-intrusive migration of MPI processes in OS-bypass networks, 1728
Gropp, 1996, A high-performance, portable implementation of the MPI message passing interface standard, Parallel Comput., 22, 789, 10.1016/0167-8191(96)00024-5
Milojičić, 2000, Process migration, ACM Comput. Surv., 32, 241, 10.1145/367701.367728
Clauss, 2016, Dynamic process management with allocation-internal co-scheduling towards interactive supercomputing, 13
Mamidala, 2006, Efficient SMP-aware MPI-level broadcast over InfiniBand’s hardware multicast, 8 pp.
Buntinas, 2006, Data transfers between processes in an SMP system: performance study and application to MPI, 487
Chai, 2006, Designing high performance and scalable MPI intra-node communication support for clusters, 1
Zhang, 2009, Process mapping for MPI collective communications, 81
Zhu, 2009, Hierarchical collectives in MPICH2, 5759, 325
Graham, 2008, MPI support for multi-core architectures: optimized shared memory collectives, 5205, 130
Sanders, 2002, The hierarchical factor algorithm for all-to-all communication (research note), 2400, 799
Sanders, 2006, Parallel prefix (Scan) algorithms for MPI, 4192, 49
Träff, 2014, MPI collectives and datatypes for hierarchical all-to-all communication, 27:27
Yeom, 2005, An efficient collective communication method using a shortest path algorithm in a computational grid, 3795, 250
Gupta, 2006, Application-oriented adaptive MPI_Bcast for grids, 128
Park, 2003, Dynamic topology selection for high performance MPI in the grid environments, 2840, 595
Kielmann, 1999, Magpie: MPI’s collective communication operations for clustered wide area systems, SIGPLAN Not., 34, 131, 10.1145/329366.301116
Fagg, 2000, ACCT: automatic collective communications tuning, 1908, 354
Vadhiyar, 2001, Performance modeling for self adapting collective communications for MPI, 23(1), 15
Ramos, 2005, A reconfigurable MPI broadcast function, 6
Godwin, 2012, Runtime optimization of broadcast communications using dynamic network topology information from MPI, 287
Subramoni, 2011, Design and evaluation of network topology-/speed-aware broadcast algorithms for InfiniBand clusters, 317
Gong, 2015, Network performance aware MPI collective communication operations in the cloud, IEEE Trans. Parallel Distrib. Syst., 26, 3079, 10.1109/TPDS.2013.96
Buntinas, 2006, Design and evaluation of nemesis, a scalable, low-latency, message-passing communication subsystem, 10
Buntinas, 2007, Implementation and evaluation of shared-memory communication and synchronization operations in MPICH2 using the nemesis communication subsystem, Parallel Comput., 33, 634, 10.1016/j.parco.2007.06.003
Buntinas, 2009, Cache-efficient, intranode, large-message MPI communication with MPICH2-nemesis, 462
Gropp, 2001, MPICH Abstract Device Interface Version 3.3 Reference Manual
Balaji, 2010, PMI: a Scalable Parallel Process-management Interface for Extreme-scale Systems, 6305, 31
Kivity, 2007, kvm: the Linux Virtual Machine Monitor, 225
Bellard, 2005, QEMU, a fast and portable dynamic translator, 46
Intel Virtualization Technology for Directed I/O, 2014
Access Division, 2011, PCI-SIG SR-IOV Primer
Pickartz, 2014, Migration techniques in HPC environments, 486
Macdonell, 2011
Zhang, 2014, Can inter-VM shmem benefit MPI applications on SR-IOV based virtualized InfiniBand clusters?, 342
Zhang, 2014, High performance MPI library over SR-IOV enabled InfiniBand clusters, 1
MPI, 2014, Benchmarks