Читать книгу From Traditional Fault Tolerance to Blockchain - Wenbing Zhao - Страница 61
2.3.2.1 Data Structures
ОглавлениеIn the sender-based message logging protocol, each process must maintain the following data structures:
◾ A counter, seq_counter, used to assign a sequence number (using the current value of the counter) to each outgoing (application) message. The counter is initialized to 0 and incremented by one for each message sent. The sequence number is needed for duplicate detection (at the receiving process).
◾ A table used to carry out duplicate detection on incoming messages. The table consists of a collection of entries, one for each process with which the current one communicates. Each entry has the form <process_id,max_seq>, where max_seq is the maximum sequence number that the current process has received from a process with an identifier of process_id. A message is deemed as a duplicate if it carries a sequence number lower or equal to max_seq for the corresponding process.
◾ Another counter, rsn_counter, used to record the receiving/execution order of an incoming message. The counter is initialized to 0 and incremented by one for each message received. The receiving order of a message is represented by the current value of the counter and it is sent back to the sending process of the message for logging.
◾ A message log (in volatile memory) for messages sent by the process. In addition to the message sent, the following meta data is also recorded for each message:- Destination process id, receiver_id;- Sending sequence number, seq;- Receiving sequence number, rsn.The destination process id, the sending sequence number, and the message will be logged prior to the sending of the message. However, the receiving order number will be logged after the process receives such information later.
◾ A history list for the messages received since the last checkpoint. Each entry in the list has the following information regarding each message received:- Sending process id, sender_id;- Sending sequence number, seq;- Receiving sequence number, rsn (assigned by the current process).The history list is used to find the receiving order number for a duplicate message received. Upon receiving a duplicate message, the process should supply the corresponding (original) receiving order number so that the sender of the message can log such ordering information properly.
All the data structures described above except the history list must be checkpointed together with the process state. The two counters, one for assigning the message sequence number and the other for assigning the message receiving order, are needed so that the process can continue doing so upon recovery using the checkpoint. The table for duplicate detection is needed for a similar reason. However, the saving of the message log as part of the checkpoint might appear to be counter-intuitive because a major benefit of doing checkpointing is to truncate the message log (i.e., garbage collect logged messages) for (receiver-based) pessimistic logging as described in section 2.3.1. For sender-based message logging, unfortunately this side benefit is no longer applicable. The message log is needed for the receiving processes to recover from a failure, and hence, cannot be garbage collected upon a checkpointing operation. Additional mechanism, which will be introduced towards the end of this section, is necessary to ensure that the message log does not grow indefinitely.
The reason why the history list can be garbage collected upon a checkpointing operation is because the receiving sequence number information in the list (i.e., the receiving/execution order of the messages leading to the checkpoint) will no longer be needed for failure recovery. When a process receives a duplicate message and it cannot find the corresponding receiving sequence number in the history list because it has recently checkpointed its state, it may inform the sender that the message can now be purged from its message log – it is no longer needed for failure recovery due to the recent checkpoint.
In addition to the above data structures, the protocol uses the following types of messages:
◾ REGULAR message type. It is used for sending regular messages generated by the application process, and it has the form <REGULAR,seq,rsn,m>, where m refers to the message content. Obviously, at the time of sending of a message, its receiving sequence number, rsn, would not be known to the sending process, in which case, it assumes a special constant value (such as -1) indicating the unknown status. When a logged message is replayed to a recovering process, the sending process might have already learned the rsn value, in which case, a concrete rsn value is supplied.
◾ ORDER message type. It is used for the receiving process is notify the sending process the receiving/execution order of the message. An ORDER message carries the form <ORDER, [m], rsn>, where [m] is the message identifier consisting of a tuple <sender_id, receiver_ id, seq>.
◾ ACK message type. It is used for the sending process (of a regular message) to acknowledge the receipt of the ORDER message. It assumes the form <ACK, [m]>.