In this paper we present a novel methodology for improving the performance and dependability of application-level messaging in Grid systems. Based on the Network Weather Service, our system uses nonparametric statistical forecasts of request-response times to automatically determine message timeouts. By choosing a timeout based on predicted network performance, the methodology improves application...
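The idea of deriving a timeout from forecast request-response times can be illustrated with a minimal sketch. It is not the Network Weather Service forecasters or the authors' method: it assumes a plain nearest-rank empirical quantile over recent observations as the nonparametric predictor, and the function name `forecast_timeout` and its parameters are hypothetical.

```python
import math


def forecast_timeout(samples, quantile=0.95, floor=0.0):
    """Pick a timeout as an empirical quantile of observed response times.

    Hypothetical helper: a stand-in for a nonparametric forecast,
    not the Network Weather Service API.
    """
    if not samples:
        raise ValueError("need at least one observed response time")
    if not 0.0 < quantile <= 1.0:
        raise ValueError("quantile must be in (0, 1]")
    ordered = sorted(samples)
    # Nearest-rank quantile: the smallest observation such that at least
    # `quantile` of the samples are less than or equal to it.
    rank = max(1, math.ceil(quantile * len(ordered)))
    return max(floor, ordered[rank - 1])


# Recent request-response times in seconds (illustrative values only).
observed = [0.10, 0.12, 0.11, 0.50, 0.13, 0.12, 0.11, 0.10, 0.14, 0.12]
timeout = forecast_timeout(observed, quantile=0.95)
```

A higher quantile tolerates slow responses at the cost of detecting failures later; a lower one fails fast but risks spurious timeouts.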
An implementation of a distributed model coupling framework is documented. This framework provides the infrastructure for a data-flow approach for solving the problem of distributed numerical models sharing coupling information. There exists a centralized server which stores coupling information such as surface fluxes. This information is then passed to client applications (numerical models) throu...
In systems consisting of multiple clusters of processors interconnected by relatively slow communication links, co-allocation may be required. We study its performance by means of simulations, depending on the structure and sizes of jobs, and the communication speed ratio. We model a multicluster with C clusters of identical processors. The workload consists of rigid jobs that require fixed number...
Performance models provide significant insight into the performance relationships between an application and the system used for execution. The major obstacle to developing performance models is the lack of knowledge about the performance relationships between the different functions that compose an application. This paper addresses the issue by using a coupling parameter, which quantifies the int...
Information services are an integral part of a Grid architecture. They are the foundation for identifying resources and their state. More importantly, Grid users can form a picture of how the Grid operates, how it performs, and what capabilities it offers through information services. The Accelerated Strategic Computing Initiative (ASCI) has designed and deployed a set of Grid services in...
#Access control #Laboratories #Grid computing #Resource management #Computer architecture #Distributed computing #Filters #Acceleration #Context-aware services #Computer networks
A novel component-based, service-oriented framework for distributed metacomputing is described. Adopting a provider-centric view of resource sharing, this project emphasizes lightweight software infrastructures that maintain a minimal state, and interface to current and emerging distributed computing standards. Resource owners host a software backplane onto which owners, clients, or third-party, r...
Error propagation is a central problem in grid computing. We re-learned this while adding a Java feature to the Condor computational grid. Our initial experience with the system was negative, due to the large number of new ways in which the system could fail. To reason about this problem, we developed a theory of error propagation. Central to our theory is the concept of an error's scope, defined ...
In this work we describe a new approach using relative debugging to find differences in computation between a serial program and a parallel version of that program. We use a combination of re-execution and backtracking in order to find the first difference in computation that may ultimately lead to an incorrect value that the user has indicated. In our prototype implementation we use static analys...
D. Gunter, B. Tierney, K. Jackson, J. Lee, M. Stoufer
Developers and users of high-performance distributed systems often observe performance problems such as unexpectedly low throughput or high latency. Determining the source of the performance problems requires detailed end-to-end instrumentation of all components, including the applications, operating systems, hosts, and networks. However, one must be very careful to design the instrumentation to h...