Rollback algorithm and crash recovery based on fault-sensitive graphs

Ying Liu, Daoxu Chen, Li Xie, Jiannong Cao

Research output: Journal article publication > Journal article > Academic research > peer-review

Abstract

The extended graph-oriented distributed programming model (ExGOM) provides a system architecture that supports dynamic configuration. Dynamic configuration covers system expansion and shrinking during execution, upgrading while running, and reconfiguration after a fault occurs. One problem in reconfiguration is how to restore the system to the state that existed just before the fault occurred. This paper proposes an asynchronous rollback algorithm and a crash recovery mechanism based on fault-sensitive graphs. It also addresses the case of multiple faulty processes on a single host with a transient fault. Compared with other asynchronous rollback and recovery algorithms, the proposed algorithm localizes the region affected by a fault, so that only the fault-sensitive nodes are rolled back, which minimizes system overhead.
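
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of localized rollback, not the authors' method. It assumes a hypothetical dependency graph in which an edge from p to q means that q has received a message from p since p's last checkpoint, and it treats every node reachable from a faulty process as "fault-sensitive". The function name, the graph encoding, and the edge semantics are all assumptions made for illustration.

```python
from collections import deque


def fault_sensitive_nodes(dependents, faulty):
    """Return the faulty nodes plus every node reachable from them.

    dependents: dict mapping a node to the nodes that depend on it
                (hypothetical encoding, not taken from the paper).
    faulty:     iterable of nodes that failed, e.g. all processes
                hosted on one machine that suffered a transient fault.
    """
    to_roll_back = set(faulty)
    queue = deque(to_roll_back)
    while queue:
        node = queue.popleft()
        for dependent in dependents.get(node, ()):
            if dependent not in to_roll_back:
                to_roll_back.add(dependent)
                queue.append(dependent)
    return to_roll_back


# Hypothetical system with two independent groups of processes.
graph = {"p1": ["p2"], "p2": ["p3"], "p4": ["p5"]}

# A fault in p1 touches only p1, p2, p3; p4 and p5 keep running.
print(sorted(fault_sensitive_nodes(graph, ["p1"])))
```

Under these assumptions, passing several faulty processes at once (for example, all processes on one failed host) yields the union of their affected regions, while unrelated nodes are left untouched.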
Original language: Chinese (Simplified)
Pages (from-to): 235-239
Number of pages: 5
Journal: Ruan Jian Xue Bao/Journal of Software
Volume: 11
Issue number: 2
Publication status: Published - 1 Feb 2000

ASJC Scopus subject areas

  • Software
