From Traditional Fault Tolerance to Blockchain - Wenbing Zhao

2.3 Log Based Protocols

Checkpoint-based protocols can only recover the system to the most recent consistent global state that has been recorded; any execution that happened afterwards is lost. Logging can be used to recover the system to the state right before the failure, provided that the piecewise deterministic assumption is valid. In log based protocols, the execution of a process is modeled as a sequence of consecutive state intervals [21]. Each state interval is initiated by a nondeterministic event (such as the receipt of a message) or by the initialization of the process, and is followed by a sequence of deterministic state changes. As long as the nondeterministic event is logged, the entire state interval can be replayed.

As an example, three state intervals are shown in Figure 2.10. The first state interval starts at the initialization of the process Pi and ends right before it executes the first message received, m₁. Note that the sending of message m₀ is not considered a nondeterministic event. The second state interval is initiated by the receipt of message m₁ and ends prior to the receipt of m₃. Similarly, the third state interval starts with the receipt of m₃ and ends prior to the receipt of m₅.
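The partitioning into state intervals can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Event` type, the `split_into_state_intervals` helper, and the event trace are hypothetical and loosely mirror the description of Figure 2.10 rather than reproduce it. The only rule taken from the text is that initialization and message receipt start a new interval, while sends and local steps do not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str   # "init", "recv", "send", or "local" (hypothetical categories)
    label: str


# Only initialization and message receipt start a new state interval.
INTERVAL_STARTERS = {"init", "recv"}


def split_into_state_intervals(trace):
    """Group a process's event trace into consecutive state intervals."""
    intervals = []
    for event in trace:
        if event.kind in INTERVAL_STARTERS or not intervals:
            intervals.append([event])      # nondeterministic event: new interval
        else:
            intervals[-1].append(event)    # deterministic step: same interval
    return intervals


# Hypothetical trace for process P_i (an assumption, not taken from the figure).
trace = [
    Event("init", "P_i starts"),
    Event("send", "m0"),
    Event("recv", "m1"),
    Event("local", "compute"),
    Event("recv", "m3"),
    Event("local", "compute"),
]

intervals = split_into_state_intervals(trace)
for number, interval in enumerate(intervals, start=1):
    print(number, [f"{e.kind}:{e.label}" for e in interval])
```

In this sketch the send of m0 stays inside the first interval, which matches the observation that sending a message is not treated as a nondeterministic event.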

In the remainder of this section, we assume that the only type of nondeterministic event is the receipt of an application message. Therefore, logging is synonymous with message logging.

Figure 2.10 Example state intervals.

For all practical purposes, logging is always used in conjunction with checkpointing to enjoy two benefits:

1 It limits the recovery time: to recover from a failure, the process can be restarted from its last checkpoint (instead of from its initial state), and its state right before the failure can be restored by replaying the logged nondeterministic events.
2 It limits the size of the log. By taking a checkpoint periodically, the logged events prior to the checkpoint can be garbage collected.
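The two benefits above can be illustrated with a toy process. This is a minimal sketch, assuming a deterministic process whose state changes only when a message is delivered; the `LoggedProcess` class, its fields, and the integer messages are hypothetical, and in-memory attributes stand in for stable storage.

```python
import copy


class LoggedProcess:
    """Toy deterministic process combining checkpointing with message logging."""

    def __init__(self):
        self.state = {"total": 0, "delivered": 0}
        # The two attributes below stand in for stable storage (an assumption).
        self.checkpoint = copy.deepcopy(self.state)
        self.log = []  # messages logged since the last checkpoint

    def apply(self, message):
        # Deterministic state change driven only by the message content.
        self.state["total"] += message
        self.state["delivered"] += 1

    def deliver(self, message):
        self.log.append(message)  # log the nondeterministic event first
        self.apply(message)       # then execute it

    def take_checkpoint(self):
        self.checkpoint = copy.deepcopy(self.state)
        self.log.clear()          # entries before the checkpoint are garbage collected

    def recover(self):
        # Restart from the last checkpoint and replay the logged messages in order.
        self.state = copy.deepcopy(self.checkpoint)
        for message in self.log:
            self.apply(message)


p = LoggedProcess()
for m in (5, 7):
    p.deliver(m)
p.take_checkpoint()               # log is truncated here
for m in (11, 13):
    p.deliver(m)

state_before_failure = copy.deepcopy(p.state)
p.state = None                    # simulate a crash that wipes volatile state
p.recover()
print(p.state == state_before_failure, len(p.log))
```

Only the two messages delivered after the checkpoint are replayed, so recovery work and log size are both bounded by the checkpoint interval.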

Logging protocols can be classified into three types [7]:

◾ Pessimistic logging. A message received is synchronously logged prior to its execution.

◾ Optimistic logging. To reduce the latency overhead, the nondeterministic events are first stored in volatile memory and logged asynchronously to stable storage. Consequently, the failure of a process might result in the permanent loss of some messages, which would force a rollback to a state earlier than the state at the time of the failure.

◾ Causal logging. The nondeterministic events (and their determinants, such as the delivery order of messages received at a process) that have not yet been logged to stable storage are piggybacked with each message sent. With the piggybacked information, a process has access to all the nondeterministic events that may have causal effects on its state, thereby enabling a consistent recovery of the system upon a failure.
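The difference between pessimistic and optimistic logging shows up in what survives a crash. The sketch below is a simplified illustration, not an implementation of any particular protocol: the `MessageLogger` class, the `run` helper, and the flush point are hypothetical, and two Python lists stand in for stable storage and volatile memory. Causal logging is left out because it also requires piggybacking determinants on outgoing messages.

```python
class MessageLogger:
    """Toy logger contrasting synchronous and asynchronous message logging."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.stable = []    # stands in for stable storage; survives a crash
        self.volatile = []  # in-memory buffer; lost on a crash

    def on_receive(self, message):
        if self.synchronous:
            self.stable.append(message)    # pessimistic: logged before execution
        else:
            self.volatile.append(message)  # optimistic: buffered, flushed later

    def flush(self):
        # Asynchronous write of buffered messages to stable storage.
        self.stable.extend(self.volatile)
        self.volatile.clear()

    def crash(self):
        # Volatile contents are lost; return what can still be replayed.
        self.volatile.clear()
        return list(self.stable)


def run(synchronous):
    logger = MessageLogger(synchronous)
    for index, message in enumerate(["m1", "m2", "m3", "m4"], start=1):
        logger.on_receive(message)
        if index == 2:
            logger.flush()  # hypothetical flush point, after the second message
    return logger.crash()


pessimistic_recovered = run(synchronous=True)
optimistic_recovered = run(synchronous=False)
print(pessimistic_recovered)
print(optimistic_recovered)
```

With synchronous logging all four messages can be replayed, whereas the asynchronous variant loses the messages received after the last flush and must roll back to an earlier state.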

In both optimistic logging [21, 19, 20] and causal logging protocols [1], the dependencies among the processes have to be tracked, and sufficient dependency information has to be piggybacked with each message sent. This not only increases the complexity of the logging mechanisms but, most importantly, makes failure recovery more sophisticated and expensive: the recovering process has to examine its logs to determine whether it is missing any messages, which often causes cascading recovery operations at other processes.

On the other hand, pessimistic logging protocols are much simpler in design and implementation, and failure recovery can be made much faster [11] (specific advantages are elaborated in Section 2.3.1 below). Therefore, our discussion focuses on pessimistic logging techniques, and optimistic and causal logging are not elaborated further.
