Original Articles

Analytical study of migration-enhanced fault tolerance for long-running applications in IFR systems

Pages 409-426 | Received 14 Jan 2008, Accepted 14 Jan 2008, Published online: 11 Sep 2008
 

Abstract

Computer systems with an increasing failure rate (IFR) are common in practice. For such systems, the literature indicates that aperiodic checkpointing can outperform periodic checkpointing because it adapts to the failure process. For long-running applications, however, aperiodic checkpointing incurs substantial operational overhead because of the frequent checkpointing operations required as the application proceeds. To address this problem, we propose combining just-in-time process migration with aperiodic checkpointing for applications running in an IFR system, with the goal of reducing application execution time in the presence of failures. In particular, we present an analytical study of this migration-enhanced fault tolerance scheme (denoted migCP): we derive the application completion time under migCP and determine the optimal migration locations. We demonstrate, through analytical modelling and empirical studies, that migCP outperforms aperiodic checkpointing under a variety of system parameters.

Acknowledgements

This work is supported in part by the US National Science Foundation grants CNS-0720549, CCF-0702737, NGS-0406328, and a TeraGrid Computer Allocation.

Notes on contributors

Yawei Li
