Original Articles

Logging based Coordinated Checkpointing in Mobile Distributed Computing Systems

Lalit Kumar, Parveen Kumar & R K Chauhan
Pages 485-490 | Published online: 26 Mar 2015
 

Abstract

Checkpointing is an efficient way of implementing fault tolerance in distributed systems. Mobile computing raises new issues, such as high mobility, lack of stable storage on mobile hosts (MHs), low bandwidth of wireless channels, limited battery life, and disconnections, that make traditional checkpointing protocols unsuitable for such systems. Checkpointing can be independent, synchronous, quasi-synchronous, or message-logging based. In synchronous checkpointing, all processes, or all interacting processes, must checkpoint synchronously: extra synchronization messages are sent, some information may be piggybacked onto computation messages, processes may be blocked, and on a fault all processes are forced to roll back. It is difficult for multiple MHs to checkpoint synchronously because of disconnections and unreliable wireless channels. Moreover, MHs are prone to frequent failures, each of which would require all processes to roll back. In this paper, we propose a hybrid non-intrusive checkpointing protocol in which fixed hosts checkpoint synchronously and MHs checkpoint independently. The proposed scheme gives MHs autonomy in taking checkpoints and reduces the information piggybacked onto computation messages. An MH can recover independently by using its most recent checkpoint and message log, without forcing other nodes to roll back.
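The independent recovery described in the abstract can be illustrated with a minimal sketch. It is not the authors' protocol. It shows only the general idea of restoring a checkpoint and replaying a message log. The names `StableStorage` and `MobileHost` are hypothetical. The sketch also assumes that message handling is deterministic and that the checkpoint and log are kept somewhere that survives an MH failure.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class StableStorage:
    """Hypothetical store that survives an MH failure (an assumption)."""
    checkpoint: dict = field(default_factory=dict)
    message_log: list = field(default_factory=list)


class MobileHost:
    """Hypothetical MH whose state is a running total of received values."""

    def __init__(self, storage):
        self.storage = storage
        self.state = {"total": 0, "received": 0}
        self.take_checkpoint()

    def take_checkpoint(self):
        # Save a copy of the current state. Messages logged before this
        # point are already reflected in it, so the log can be cleared.
        self.storage.checkpoint = copy.deepcopy(self.state)
        self.storage.message_log.clear()

    def _apply(self, value):
        # Deterministic handler: replaying the same messages in the same
        # order always produces the same state.
        self.state["total"] += value
        self.state["received"] += 1

    def receive(self, value):
        # Log the message before processing it so it can be replayed.
        self.storage.message_log.append(value)
        self._apply(value)

    def crash(self):
        # Simulate a failure: the volatile state on the MH is lost.
        self.state = None

    def recover(self):
        # Restore the last checkpoint, then replay the logged messages.
        # No other process is asked to roll back.
        self.state = copy.deepcopy(self.storage.checkpoint)
        for value in self.storage.message_log:
            self._apply(value)


storage = StableStorage()
mh = MobileHost(storage)
mh.receive(5)
mh.take_checkpoint()
mh.receive(7)
mh.receive(11)
state_before_failure = copy.deepcopy(mh.state)
mh.crash()
mh.recover()
```

After `recover()`, the MH state matches `state_before_failure`, because the two messages received after the checkpoint are replayed from the log in their original order.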

Additional information

Notes on contributors

Lalit Kumar

Lalit Kumar received his MTech (Computer Sc & Engg) from IIT Delhi in 1993 and his PhD from IIT Roorkee in 2003. He has been on the faculty of Computer Science and Engineering at the National Institute of Technology, Hamirpur (HP), since 1989. He currently holds the position of Assistant Professor and Head of the Department. His research interests include mobile distributed systems, checkpointing-based fault tolerance, and ad hoc networks. He has contributed more than 35 articles to national and international journals and conferences.

Parveen Kumar

Parveen Kumar received his MCA from Kurukshetra University in 1989 and his MS (Software Systems) from BITS Pilani in 2001. He has been working as a Programmer at the National Institute of Technology, Hamirpur (HP), since 1991. He is pursuing his PhD on checkpointing-based fault tolerance in mobile distributed systems and has contributed more than 15 articles to journals and international conferences.

R K Chauhan

R K Chauhan received his doctoral degree in computer science from Kurukshetra University, Kurukshetra, India, in 2000. He works in the fields of mobile computing, ad hoc networks, databases, and GIS. Since 1989, he has been on the faculty of Computer Science and Applications, Kurukshetra University. He is a member of CSI.
