Dynamic monitoring of high-performance distributed applications
Proceedings 11th IEEE International Symposium on High Performance Distributed Computing - Pages 163-170
Abstract
Developers and users of high-performance distributed systems often observe performance problems such as unexpectedly low throughput or high latency. Determining the source of the performance problems requires detailed end-to-end instrumentation of all components, including the applications, operating systems, hosts, and networks. However, one must be very careful to design the instrumentation to have extremely low overhead, and not affect the system being monitored. In this paper we present a very light-weight instrumentation system that can be dynamically activated to unobtrusively collect and aggregate detailed end-to-end monitoring information from distributed applications. We also show how emerging "web services" can be used to facilitate remote interaction with this system.
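As a rough illustration of what "dynamically activated" instrumentation can look like, the following is a minimal sketch and not the system presented in the paper. The `Monitor` class, its `activate`/`log` methods, and the `transfer` workload are hypothetical names introduced here. The sketch assumes that low overhead is obtained by returning immediately from each logging call while monitoring is switched off.

```python
import time


class Monitor:
    """Hypothetical event recorder that is inactive by default."""

    def __init__(self):
        self.enabled = False   # instrumentation stays off until activated
        self.events = []       # in-memory buffer of (timestamp, name, fields)

    def activate(self):
        self.enabled = True

    def deactivate(self):
        self.enabled = False

    def log(self, name, **fields):
        # Cheap early exit: an inactive monitor only pays for this check.
        if not self.enabled:
            return
        self.events.append((time.time(), name, fields))


def transfer(monitor, blocks):
    """Toy workload (an assumption) with start/end events around each block."""
    total = 0
    for i, block in enumerate(blocks):
        monitor.log("block.start", index=i)
        total += len(block)
        monitor.log("block.end", index=i, size=len(block))
    return total


monitor = Monitor()
first = transfer(monitor, [b"abc", b"de"])    # monitoring off: nothing recorded
events_while_off = len(monitor.events)

monitor.activate()                            # switched on at run time
second = transfer(monitor, [b"abc", b"de"])   # monitoring on: events recorded
events_while_on = len(monitor.events)
```

The application code is identical in both runs. Only the monitor's state changes, which is the property that lets instrumentation remain in place permanently and be switched on only when a performance problem needs to be investigated.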
Keywords
#Distributed computing #Instruments #Pipelines #XML #Condition monitoring #Computer buffers #Grid computing #High performance computing #Laboratories #Libraries