Effectual Non-Intrusive Minimum-Process Synchronous Checkpointing Protocol for Mobile Distributed Systems

Authors

  • Raman Kumar, Research Scholar, Desh Bhagat University, Mandi Gobindgarh, Punjab, India
  • Er Jyoti Arora, Research Scholar, Desh Bhagat University, Mandi Gobindgarh, Punjab, India

DOI:

https://doi.org/10.32628/IJSRST218427

Keywords:

Mobile Computing Systems, Coordinated Checkpointing, Consistent Checkpoints, Global Snapshot, Recovery.

Abstract

Mobile distributed systems raise several issues: mobility, low bandwidth of wireless channels, scarce stable storage on mobile nodes, disconnections, limited battery power, and a high failure rate of mobile nodes. Minimum-process coordinated checkpointing is considered an attractive way to introduce fault tolerance in mobile systems transparently. In this paper, we propose a non-blocking coordinated global state compilation algorithm for mobile computing systems that requires only a minimum number of processes to take permanent recovery points. We reduce the communication complexity compared with the Cao-Singhal algorithm [4] while keeping the number of useless recovery points unchanged. Finally, the paper presents an optimization technique that significantly reduces the number of useless recovery points at the cost of a minor increase in communication complexity. In coordinated global state compilation, if a single process fails to take its tentative recovery point, the entire recovery point effort is aborted. We try to reduce the effort lost in such an abort by taking soft recovery points at the Mobile Hosts in the first phase.
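
The general idea behind minimum-process coordinated checkpointing can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper: it assumes the textbook notion that the minimum set is the transitive closure of the initiator's direct dependencies, and it models the all-or-nothing first phase in the simplest possible way. The names `minimum_set`, `coordinate`, `depends_on`, and `can_checkpoint` are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Set, Tuple


def minimum_set(initiator: str, depends_on: Dict[str, Set[str]]) -> Set[str]:
    """Return the initiator plus every process it depends on, directly or
    transitively. depends_on[p] is assumed to hold the processes from which
    p has received a message since its last recovery point."""
    selected = {initiator}
    pending = [initiator]
    while pending:
        p = pending.pop()
        for q in depends_on.get(p, set()):
            if q not in selected:
                selected.add(q)
                pending.append(q)
    return selected


def coordinate(
    initiator: str,
    depends_on: Dict[str, Set[str]],
    can_checkpoint: Callable[[str], bool],
) -> Tuple[str, Set[str]]:
    """Phase 1: every process in the minimum set takes a tentative recovery
    point. Phase 2: commit only if all of them succeeded; a single failure
    aborts the whole effort and discards the tentative points."""
    tentative: Set[str] = set()
    for p in sorted(minimum_set(initiator, depends_on)):
        if not can_checkpoint(p):
            return "abort", set()
        tentative.add(p)
    return "commit", tentative


if __name__ == "__main__":
    # Hypothetical dependencies: P1 received from P2, P2 from P3, P4 from P1.
    deps = {"P1": {"P2"}, "P2": {"P3"}, "P3": set(), "P4": {"P1"}}
    print(minimum_set("P1", deps))
    print(coordinate("P1", deps, lambda p: True))
    print(coordinate("P1", deps, lambda p: p != "P3"))
```

In this sketch P4 is never asked to take a recovery point because P1 does not depend on it, and the failure of P3 alone causes the whole attempt to abort, which is the cost the soft recovery points described in the abstract aim to reduce.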

References

  1. K. M. Chandy and L. Lamport, “Distributed Snapshots: Determining Global States of Distributed Systems,” ACM Transactions on Computer Systems, vol. 3, no. 1, pp. 63-75, Feb. 1985.
  2. Prakash R. and Singhal M., “Low-Cost Checkpointing and Failure Recovery in Mobile Computing Systems,” IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 10, pp. 1035-1048, October 1996.
  3. Guohong Cao and Mukesh Singhal, “On Coordinated Checkpointing in Distributed Systems,” IEEE Transactions on Parallel and Distributed Systems, vol. 9, no. 12, pp. 1213-1224, December 1998.
  4. Guohong Cao and Mukesh Singhal, “Mutable Checkpoints: A New Checkpointing Approach for Mobile Computing Systems,” IEEE Transactions on Parallel and Distributed Systems, vol. 12, no. 2, pp. 157-171, February 2001.
  5. Weigang Ni, Susan V. Vrbsky and Sibabrata Ray, “Pitfalls in Distributed Non blocking Checkpointing,” University of Alabama.
  6. Prakash R. and Singhal M., “Maximal Global Snapshot with Concurrent Initiators,” Proc. Sixth IEEE Symp. Parallel and Distributed Processing, pp. 344-351, Oct. 1994.
  7. R. Koo and S. Toueg, “Checkpointing and Rollback-Recovery for Distributed Systems,” IEEE Transactions on Software Engineering, vol. SE-13, no. 1, pp. 23-31, January 1987.
  8. Bidyut Gupta, S. Rahimi and Z. Liu, “A New High Performance Checkpointing Approach for Mobile Computing Systems,” IJCSNS International Journal of Computer Science and Network Security, vol. 6, no. 5B, May 2006.
  9. Acharya A. and Badrinath B. R., “Checkpointing Distributed Applications on Mobile Computers,” Proceedings of the 3rd International Conference on Parallel and Distributed Information Systems, pp. 73-80, September 1994.
  10. Ch. D. V. Subba Rao and M. M. Naidu, “A New, Efficient Coordinated Checkpointing Protocol Combined with Selective Sender-Based Message Logging.”
  11. Nuno Neves and W. Kent Fuchs, “Adaptive Recovery for Mobile Environments,” in Proc. IEEE High-Assurance Systems Engineering Workshop, October 21-22, 1996, pp. 134-141.
  12. Y. Manabe, “A Distributed Consistent Global Checkpoint Algorithm with Minimum Number of Checkpoints,” Technical Report of IEICE, COMP97-6, April 1997.
  13. J. L. Kim and T. Park, “An Efficient Protocol for Checkpointing Recovery in Distributed Systems,” IEEE Transactions on Parallel and Distributed Systems, vol. 4, no. 8, pp. 955-960, Aug. 1993.
  14. Elnozahy E. N., Alvisi L., Wang Y. M. and Johnson D. B., “Survey of Rollback-Recovery Protocols in Message-Passing Systems,” ACM Computing Surveys, vol. 34, no. 3, pp. 375-408, 2002.
  15. S. Venkatesan and T. T.-Y. Juang, “Low Overhead Optimistic Crash Recovery,” preliminary version appears in Proc. 11th Int’l Conf. Distributed Computing Systems as “Crash Recovery with Little Overhead,” pp. 454-461, 1991.
  16. Parveen Kumar, Lalit Kumar, R. K. Chauhan, “A Non-intrusive Hybrid Synchronous Checkpointing Protocol for Mobile Systems,” IETE Journal of Research, vol. 52, no. 2&3, 2006.
  17. J. L. Kim, T. Park, “An Efficient Protocol for Checkpointing Recovery in Distributed Systems,” IEEE Trans. Parallel and Distributed Systems, pp. 955-960, Aug. 1993.
  18. Mansouri, H., Pathan, A.-S.K.: Review of checkpointing and rollback recovery protocols for mobile distributed computing systems. In: Ghosh, U., Rawat, D.B., Datta, R., Pathan, A.-S.K. (eds.) Internet of Things and Secure Smart Environments: Successes and Pitfalls, CRC Press, Taylor & Francis Group (2020).
  19. Mansouri, H., Pathan, A.-S.K.: Checkpointing distributed application running on mobile Ad Hoc networks. Int. J. High Perform. Comput. Networking 11(2), 95–107 (2018).
  20. Mansouri, H., Pathan, A.-S.: A resilient hierarchical checkpointing algorithm for distributed systems running on cluster federation. In: Thampi, S.M., Martinez Perez, G., Ko, R., Rawat, D.B. (eds.) SSCC 2019. CCIS, vol. 1208, pp. 99–110. Springer, Singapore (2020).

Published

2021-08-30

Section

Research Articles

How to Cite

[1]
Raman Kumar, Er Jyoti Arora, "Effectual Non-Intrusive Minimum-Process Synchronous Checkpointing Protocol for Mobile Distributed Systems," International Journal of Scientific Research in Science and Technology (IJSRST), Online ISSN: 2395-602X, Print ISSN: 2395-6011, Volume 8, Issue 4, pp. 325-331, July-August 2021. Available at doi: https://doi.org/10.32628/IJSRST218427