Fast asynchronous uniform consensus in real-time distributed systems

IEEE Transactions on Computers - Tập 51 Số 8 - Trang 931-944 - 2002
J.-F. Hermant1, G. Le Lann1
1Domaine de Voluceau, Inria, Le Chesnay, France

Tóm tắt

We investigate whether asynchronous computational models and asynchronous algorithms can be considered for designing real-time distributed fault-tolerant systems. A priori, the lack of bounded finite delays is antagonistic with timeliness requirements. We show how to circumvent this apparent contradiction, via the principle of "late binding" of a solution to some (partially) synchronous model. This principle is shown to maximize the coverage of demonstrated safety, liveness, and timeliness properties. These general results are illustrated with the uniform consensus (UC) and real-time UC problems, assuming processor crashes and reliable communications, considering asynchronous solutions based upon unreliable failure detectors. We introduce the concept of fast failure detectors and show that the problem of building strong or perfect fast failure detectors in real systems can be stated as a distributed message scheduling problem. A generic solution to this problem is given and illustrated considering deterministic Ethernets. In passing, it is shown that, with our construction of unreliable failure detectors, asynchronous algorithms that solve UC have a worst-case termination lower bound that matches the optimal synchronous lower bound. Finally, we introduce FastUC, a novel solution to UC, that is based upon fast failure detectors.

Từ khóa

#Real time systems #Detectors #Distributed computing #Computational modeling #Algorithm design and analysis #Fault tolerant systems #Delay #Safety #Computer crashes #Buildings

Tài liệu tham khảo

hermant, 1998, A Protocol and Correctness Proofs for Real-Time High-Performance Broadcast Networks, Proc IEEE Int'l Conf Distributed Computing Systems, 360 10.1145/226643.226647 10.1145/7531.7533 10.1109/71.774912 10.1109/49.53013 10.1145/42282.42283 10.1145/3149.214121 10.1016/0020-0190(82)90033-3 10.1109/FTDCS.1997.644722 10.1145/343477.343630 10.1109/ICDSN.2000.857587 10.1109/5.469298 mostefaoui, 2000, Consensus Based on Failure Detectors with a Perpetual Weak Accuracy Property, Proc IEEE Int'l Parallel and Distributed Processing Symp, 514 10.1007/BF01088855 lynch, 1996, Distributed Algorithms 10.1145/321738.321743 le lann, 1989, Process and Device for the Transmission of Messages between Different Stations through a Local Distribution Network 10.1007/3-540-48169-9_3 le lann, 1995, On Real-Time and Non Real-Time Distributed Computing, Proc Int l Workshop Distributed Algorithms, 972, 51, 10.1007/BFb0022137 le lann, 1996, Proof-Based System Engineering and Embedded Systems, Proc European School on Embedded Systems, 1494, 208 le lann, 2001, Is 'Asynchronous Real-Time' an Oxymoron?, 15th Int l Symp Distributed Computing hermant, 1999, Quelques Probl�mes et Solutions en Ordonnancement Temps R�el pour Syst�mes R�partis 10.1109/ISORC.1998.666800 1999 10.1145/861.870