Читать книгу From Traditional Fault Tolerance to Blockchain - Wenbing Zhao - Страница 38

2.1.2 Process State and Global State

Оглавление

The state of an individual process is defined by its entire address space in an operating system. A generic checkpointing library (such as Condor [23]) normally saves the entire address space as a checkpoint of the process. Of course, not everything in the address space is interesting based on the application semantics. As such, the checkpoint of a process can be potentially made much smaller by exploiting application semantics.

The state of a distributed system is usually referred to as the global state of the system [5]. It is not a simple aggregation of the states of the processes in the distributed system because the processes exchange messages with each other, which means that a process may causally depend on some other processes. Such dependency must be preserved in a global state. Assume that each process in the distributed system takes checkpoints periodically, this implies that we may not be able to use the latest set of checkpoints for proper recovery should the processes fails, unless the checkpointing at different processes are coordinated [5]. To see why, considering the three scenarios illustrated in Figure 2.2 where the global state is constructed by using the three checkpoints, C0, C1, C2, taken at processes P0, P1, and P2, respectively.

Figure 2.2(a) shows a scenario in which the checkpoints taken by different processes are incompatible, and hence cannot be used to recover the system upon a failure. Let’s see why. In this scenario, P0 sends a message m0 to P1, and P1 subsequently sends a message m1 to P2. Therefore, the state of P2 potentially depends on the state of P1 after it has received m1, and the state of P1 may depend on that of P0 once it receives m0. The checkpoint C0 is taken before P0 sends the message m0 to P1, whereas the checkpoint C1 is taken after P1 has received m0. The checkpoints are not compatible because C1 reflects the receiving of m0 while C0 does not reflect the sending of m0, that is, the dependency is broken. Similarly, C2 reflects the receiving of m1 while C1 does not reflect the sending of m1.


Figure 2.2 Consistent and inconsistent global state examples.

From Traditional Fault Tolerance to Blockchain

Подняться наверх