The complexity of adding failsafe fault-tolerance
Tóm tắt
In this paper, we focus our attention on the problem of automating the addition of failsafe fault-tolerance where fault-tolerance is added to an existing (fault-intolerant) program. A failsafe fault-tolerant program satisfies its specification (including safety and liveness) in the absence of faults. And, in the presence of faults, it satisfies its safety specification. We present a somewhat unexpected result that, in general, the problem of adding failsafe fault-tolerance in distributed programs is NP-hard. Towards this end, we reduce the 3-SAT problem to the problem of adding failsafe fault-tolerance. We also identify a class of specifications, monotonic specifications and a class of programs, monotonic programs. Given a (positive) monotonic specification and a (negative) monotonic program, we show that failsafe fault-tolerance can be added in polynomial time. We note that the monotonicity restrictions are met for commonly encountered problems such as Byzantine agreement, distributed consensus, and atomic commitment. Finally, we argue that the restrictions on the specifications and programs are necessary to add failsafe fault-tolerance in polynomial time; we prove that if only one of these conditions is satisfied, the addition of failsafe fault-tolerance is still NP-hard.
Từ khóa
#Fault tolerance #Safety #Polynomials #Algorithm design and analysis #Automation #Computer science #Fault tolerant systems #Engineering profession #Contracts #Fault diagnosisTài liệu tham khảo
kulkarni, 2002, The complexity of adding failsafe fault-tolerance. Technical Report MSU-CSE.02–10, Computer Science and Engineering
10.1145/277697.277729
kupferman, 1997, Synthesis with incomplete information, ICTL
10.1145/75277.75293
10.1007/3-540-58179-0_51
10.1145/152610.152612
10.1109/RELDIS.2001.969767
gong, 1995, Byzantine agreement with authentication: Observations and applications in tolerating hybrid and link faults, Dependable Computing and Fault Tolerant Systems IEEE Comnuter Society, 10, 139
10.1145/357172.357176
10.1109/32.256850
10.1016/0020-0190(85)90056-0
kulkarni, 2000, Automating the addition of fault-tolerance, Formal Techniques in Real-Time and Fault-Tolerant Systems, 10.1007/3-540-45352-0_9
kulkarni, 1999, Component-based design of fault-tolerance
10.1145/383043.383044