Consistent global checkpoints based on direct dependency tracking

Information Processing Letters - Vol. 50 - pp. 223-230 - 1994
Yi-Min Wang1, Andy Lowry2, W. Kent Fuchs3
1AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974, USA
2IBM T.J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598, USA
3Coordinated Science Laboratory, University of Illinois, Urbana, IL 61801, USA

References

Anderson, 1987
Bhargava, 1988, Independent checkpointing and concurrent rollback for recovery - An optimistic approach, Proc. IEEE Symp. on Reliable Distributed Systems, 3
Bogart, 1983
Chandy, 1985, Distributed snapshots: Determining global states of distributed systems, ACM Trans. Comput. Systems, 3, 63, 10.1145/214451.214456
Johnson, 1990, Recovery in distributed systems using optimistic message logging and checkpointing, J. Algorithms, 11, 462, 10.1016/0196-6774(90)90022-7
Koo, 1987, Checkpointing and rollback-recovery for distributed systems, IEEE Trans. Software Engineering, 13, 23, 10.1109/TSE.1987.232562
Lamport, 1978, Time, clocks and the ordering of events in a distributed system, Comm. ACM, 21, 558, 10.1145/359545.359563
Lowry, 1991, Optimistic failure recovery for very large networks, Proc. IEEE Symp. on Reliable Distributed Systems, 66
Sistla, 1989, Efficient distributed recovery using message logging, Proc. 8th ACM Symp. on Principles of Distributed Computing, 223, 10.1145/72981.72997
Strom, 1985, Optimistic recovery in distributed systems, ACM Trans. Comput. Systems, 3, 204, 10.1145/3959.3962
Tsuruoka, 1981, Dynamic recovery schemes for distributed processes, Proc. IEEE 2nd Symp. on Reliability in Distr. Software and Database Systems, 124
Wang, 1992, Checkpoint space reclamation for uncoordinated checkpointing in message-passing systems
Wang, 1992, Scheduling message processing for reducing rollback propagation, Proc. IEEE Fault-Tolerant Computing Symp. (FTCS-22), 204
Wang, 1993, Progressive retry for software error recovery in distributed systems, Proc. IEEE Fault-Tolerant Computing Symp. (FTCS-23), 138