Abstract
A synchronous checkpointing algorithm coordinates a set of processes in taking checkpoints in such a way that the set of local checkpoints always forms part of a consistent global system state. Whenever a process p requests to take a checkpoint, a set of processes, called the cohorts set of p, must be checked, and some of them may also have to take checkpoints in order to preserve system consistency. Although several synchronous checkpointing algorithms have been proposed in the literature, most of them do not address performance. In this paper we propose an efficient distributed algorithm for synchronous checkpointing. A proof of correctness and an analysis of the algorithm's efficiency are presented. It is shown that the algorithm has better message and time complexity than the existing algorithms. The method proposed in this paper can also be applied to enhance the performance of rollback operations, which always require synchronization of the interdependent processes.
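The dependency idea behind the abstract can be illustrated with a small sketch. This is not the algorithm proposed in the paper, which is distributed and whose details are not given here. It is a centralized illustration that assumes one common consistency rule: if a process checkpoints after receiving a message, the sender of that message must also checkpoint, so that no checkpoint records a receive without the matching send. The function name and the `received_from` map are hypothetical.

```python
from collections import deque


def processes_to_checkpoint(initiator, received_from):
    """Return the set of processes that must checkpoint with `initiator`.

    `received_from[q]` (a hypothetical structure) is the set of processes
    from which q has received a message since q's last checkpoint.
    Assumed rule: if q checkpoints, every such sender must checkpoint too.
    """
    must_checkpoint = {initiator}
    queue = deque([initiator])
    while queue:
        q = queue.popleft()
        # Every sender q depends on is pulled into the checkpoint set.
        for sender in received_from.get(q, ()):
            if sender not in must_checkpoint:
                must_checkpoint.add(sender)
                queue.append(sender)
    return must_checkpoint


# Illustrative dependencies: p received from q, q from r, s from p.
deps = {"p": {"q"}, "q": {"r"}, "r": set(), "s": {"p"}}
print(sorted(processes_to_checkpoint("p", deps)))  # ['p', 'q', 'r']
```

In this example s is not forced to checkpoint when p does, because p has received nothing from s. A practical synchronous algorithm would discover the same set by exchanging messages among the processes, which is where message and time complexity become the relevant cost measures.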
Original language | English |
---|---|
Title of host publication | Proceedings of the Conference on Advances in Parallel and Distributed Computing |
Publisher | IEEE |
Pages | 261-268 |
Number of pages | 8 |
Publication status | Published - 1 Jan 1997 |
Externally published | Yes |
Event | Proceedings of the 1997 Conference on Advances in Parallel and Distributed Computing, Shanghai, China, 19 Mar 1997 → 21 Mar 1997 |
Conference
Conference | Proceedings of the 1997 Conference on Advances in Parallel and Distributed Computing |
---|---|
Country/Territory | China |
City | Shanghai |
Period | 19/03/97 → 21/03/97 |
ASJC Scopus subject areas
- General Computer Science