A knowledge-theoretic analysis of uniform distributed coordination and failure detectors
Tóm tắt
It is shown that, in a precise sense, if there is no bound on the number of faulty processes in a system with unreliable but fair communication, Uniform Distributed Coordination (UDC) can be attained if and only if a system has perfect failure detectors. This result is generalized to the case where there is a bound t on the number of faulty processes. It is shown that a certain type of generalized failure detector is necessary and sufficient for achieving UDC in a context with at most t faulty processes. Reasoning about processes’ knowledge as to which other processes are faulty plays a key role in the analysis.
Tài liệu tham khảo
Aguilera, M.K., Chen, W., Toueg, S.: Heartbeat: a timeout-free failure detector for quiescent reliable communication. In: Proc. 11th International Workshop on Distributed Algorithms, Springer, 1997, pp. 126-140. A full version is also available as Technical Report 97-1631, Department of Computer Science, Cornell University, 1997.
Aguilera, M.K., Toueg, S., Deianov, B.: Revisiting the weakest failure detector for uniform reliable broadcast. In: Proc. 13th International Symposium on Distributed Computing, 1999, pp. 19-33.
Birman, K., Joseph, T.: Exploiting virtual synchrony in distributed systems. In: 11th Symposium on Operating System Principles, 1987, pp. 123-138.
Chandra, T.D., Hadzilacos, V., Toueg, S.: The weakest failure detector for solving consensus. J. ACM 43:685-722 (1996).
Chandra, T.D., Toueg, S.: Unreliable failure detectors for reliable distributed systems. J. ACM 43(2):225-267 (1996).
Coan, B.: A communication-efficient canonical form for fault-tolerant distributed protocols. In: Proc. 5th ACM Symp. on Principles of Distributed Computing, 1986, pp. 63-72.
Fagin, R., Halpern, J.Y., Moses, Y., Vardi, M.Y.: Reasoning about Knowledge. MIT Press, Cambridge, Mass., 1995.
Fagin, R., Halpern, J.Y., Moses, Y., Vardi, M.Y.: Knowledge-based programs. Distrib. Comput. 10(4):199-225 (1997).
Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty processor. J. ACM 32(2):374-382 (1985).
Gopal, A., Toueg, S.: Reliable broadcast in synchronous and asynchronous environments. In: 3rd WDAG. Springer, volume 392 of Lecture Notes in Computer Science, Springer, Berlin Heidelberg New York, 1989, pp. 110-123.
Halpern, J.Y., Ricciardi, A.: A knowledge-theoretic analysis of uniform distributed coordination and failure detectors. In: Proc. 18th ACM Symp. on Principles of Distributed Computing, 1999, pp. 73-82.
Schiper, A., Sandoz, A.: Uniform reliable multicast in a virtually synchronous environment. In: Proc. IEEE 13th ICDCS. 1993.