Checkpointing and rollback of wide-area distributed applications using mobile agents

Jiannong Cao, G. H. Chan, Weijia Jia, T. S. Dillon

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

38 Citations (Scopus)

Abstract

We consider the problem of designing rollback error recovery algorithms for dynamic, wide-area distributed systems such as the Internet. The characteristics and scale of such a system complicate both the design and the performance of these algorithms. Traditional message-passing-based algorithms incur large overhead, in both network traffic and message-passing delay, in such a wide-area environment. In this paper, we propose a novel approach to designing checkpointing and rollback algorithms using mobile agents as an aid. Using mobile agents reduces the total amount of communication and allows us to design algorithms that take advantage of the most up-to-date system information for decision making. It also allows us to develop algorithms that implement flexible and adaptive policies. A mobile-agent-enabled hybrid algorithm combining independent and coordinated checkpointing is proposed. A prototype of the algorithms is developed using IBM's Aglets. Results of performance evaluation are presented and discussed.
Original language: English
Title of host publication: Proceedings - 15th International Parallel and Distributed Processing Symposium, IPDPS 2001
Publisher: IEEE
ISBN (Electronic): 0769509908, 9780769509907
Publication status: Published - 1 Jan 2001
Event: 15th International Parallel and Distributed Processing Symposium, IPDPS 2001 - San Francisco, United States
Duration: 23 Apr 2001 to 27 Apr 2001

Conference

Conference: 15th International Parallel and Distributed Processing Symposium, IPDPS 2001
Country: United States
City: San Francisco
Period: 23/04/01 to 27/04/01

ASJC Scopus subject areas

  • Hardware and Architecture
  • Computer Networks and Communications
