Original Articles

An enhanced model-based checkpointing protocol for preventing useless checkpoints

Pages 383-406 | Received 09 Aug 2007, Accepted 23 Oct 2008, Published online: 18 Sep 2009
 

Abstract

Checkpointing and rollback recovery are widely used techniques for handling failures in distributed computing systems. If processes do not coordinate during checkpointing, they may take useless checkpoints, that is, checkpoints that cannot be part of any consistent global checkpoint. In this paper, we propose a communication-induced checkpointing algorithm that prevents useless checkpoints by directing processes to take forced checkpoints, more efficiently, whenever a communication pattern that may lead to a Z-cycle (ZC) is observed. The existence of a ZC involving a checkpoint is known to be necessary and sufficient for that checkpoint to be useless. The basic idea behind our algorithm can be extended to existing model-based checkpointing algorithms to reduce the number of forced checkpoints they take. We also compare the performance of our algorithm with that of a well-known existing algorithm.
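To make the Z-cycle condition concrete, the following is a minimal offline sketch in Python. It is not the protocol proposed in the paper; it only checks, after the fact, whether a given checkpoint lies on a Z-cycle under the standard zigzag-path definition. The `Message` record, the `on_z_cycle` function, and the interval numbering (interval k of a process is the span after its k-th checkpoint and before its (k+1)-th) are assumptions introduced for illustration.

```python
from collections import namedtuple

# Hypothetical record for one message. Interval k on a process is the span
# after its k-th checkpoint and before its (k+1)-th checkpoint.
Message = namedtuple("Message", "sender send_interval receiver recv_interval")

def on_z_cycle(process, index, messages):
    """Return True if checkpoint `index` of `process` lies on a Z-cycle.

    A zigzag path is a message chain m1..mn in which each m(k+1) is sent by
    the receiver of m(k) in the same or a later interval than the one in
    which m(k) was received (possibly before the receipt itself).
    """
    # Start from every message the process sent after the checkpoint.
    frontier = [m for m in messages
                if m.sender == process and m.send_interval >= index]
    seen = set(frontier)
    while frontier:
        m = frontier.pop()
        # The chain closes if it is received by the process before the checkpoint.
        if m.receiver == process and m.recv_interval < index:
            return True
        for nxt in messages:
            if (nxt not in seen and nxt.sender == m.receiver
                    and nxt.send_interval >= m.recv_interval):
                seen.add(nxt)
                frontier.append(nxt)
    return False

# q sends m_a, p receives it and then takes checkpoint 1, p sends m_b,
# and q receives m_b in the same interval in which it sent m_a.
cycle = [Message("q", 0, "p", 0), Message("p", 1, "q", 0)]

# Same pattern, but q takes a forced checkpoint before delivering m_b,
# so the receipt falls in q's interval 1.
fixed = [Message("q", 0, "p", 0), Message("p", 1, "q", 1)]

print(on_z_cycle("p", 1, cycle))
print(on_z_cycle("p", 1, fixed))
```

In the first pattern, checkpoint 1 of `p` is useless: pairing it with either neighbouring checkpoint of `q` leaves an orphan message. In the second, the forced checkpoint on `q` breaks the zigzag path, which is the kind of intervention a communication-induced protocol has to trigger online from partial information.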

Acknowledgements

A preliminary version of this paper [Citation16] was presented at the 25th International Conference on Parallel and Distributed Computing and Networking. This material is based in part upon work supported by the US National Science Foundation under Grant No. IIS-0414791. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Additional information

Notes on contributors

Jiang Wu

