Adaptive checkpointing for time warp technique with a limited number of checkpoints
Abstract
This paper discusses distributed checkpointing with Time Warp techniques, a typical uncoordinated checkpointing approach that is often used in parallel and distributed simulation. Relaxing an assumption of the earlier model of Soliman et al., we present a discrete-time model in which the number of checkpoints each process can hold is finite. In addition, we propose an adaptive distributed checkpointing technique that gives an effective time arrangement of checkpoints for a recovery point distribution, and we give numerical examples.
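To make the finite-checkpoint setting concrete, here is a minimal Python sketch of a Time Warp style process that can hold only a bounded number of saved states. It is an illustration under stated assumptions, not the model or the adaptive scheme of the paper: the class name `Process`, the parameter `max_checkpoints`, the toy state (a running sum), and the policy of evicting the oldest checkpoint when the buffer is full are all hypothetical choices made for this example.

```python
from collections import deque


class Process:
    """Toy Time Warp process that keeps at most `max_checkpoints` saved states.

    Hypothetical illustration only; names and the eviction policy are assumptions.
    """

    def __init__(self, max_checkpoints):
        self.clock = 0   # local virtual time (discrete)
        self.state = 0   # toy state: running sum of event payloads
        # A deque with maxlen drops the oldest entry when a new one is appended
        # to a full buffer, which models a finite checkpoint store.
        self.checkpoints = deque([(0, 0)], maxlen=max_checkpoints)

    def process(self, timestamp, payload):
        """Optimistically execute an event at the given virtual time."""
        self.clock = timestamp
        self.state += payload

    def checkpoint(self):
        """Save the current (time, state) pair."""
        self.checkpoints.append((self.clock, self.state))

    def rollback(self, straggler_time):
        """Restore the latest checkpoint taken strictly before straggler_time.

        Returns the restored virtual time, or None if every usable
        checkpoint has already been evicted.
        """
        while self.checkpoints and self.checkpoints[-1][0] >= straggler_time:
            self.checkpoints.pop()
        if not self.checkpoints:
            return None
        self.clock, self.state = self.checkpoints[-1]
        return self.clock


if __name__ == "__main__":
    p = Process(max_checkpoints=2)
    for t in (1, 2, 3, 4):
        p.process(t, 10)
        p.checkpoint()
    print(p.rollback(4))  # a checkpoint before time 4 is still held
    print(p.rollback(2))  # checkpoints before time 2 were evicted
```

With a buffer of two, a straggler message at time 4 can be handled by restoring the checkpoint from time 3, whereas a straggler at time 2 finds no surviving checkpoint. This is the kind of trade-off that motivates choosing checkpoint times carefully when the store is finite.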
Keywords
#Checkpointing #Clocks #Time warp simulation #Analytical models #Distributed computing #Conferences
References
10.1109/TSE.1987.232562
10.1145/214451.214456
10.1109/71.737697
10.1109/71.780864
Kameda, 1994, Distributed Algorithm, Kindai-Kagaku-sha
Tanikoshi, 2001, A Note on Distributed Checkpointing with Time Warp Techniques, IEICE Technical Report, Fault Tolerant Systems, FTS 2001-31, 49
Plank, 1997, An Overview of Checkpointing in Uniprocessor and Distributed Systems, Focusing on Implementation and Performance, Technical Report UT-CS-97-372
10.1109/71.730524
Fukumoto, 1996, Analysis of the File Recovery Mechanism by Archive Copies, IEICEJ Trans D-II, j79 d i, 206
Elnozahy, 1999, A Survey of Rollback-Recovery Protocols in Message-Passing Systems, Technical Report CMU-CS-99-148