TY - JOUR
T1 - Adaptive Task-Based Intermittent Computing System With Parallel State Backup
AU - Zhang, Wei
AU - Zhang, Qianling
AU - Lv, Mingsong
AU - Liu, Songran
AU - Zhou, Zimeng
AU - Chen, Qiulin
AU - Guan, Nan
AU - Ju, Lei
N1 - Funding Information:
This work was supported in part by the Natural Science Foundation of China under Grant 92064008 and Grant 62102230; in part by the Shandong Provincial Natural Science Foundation under Grant ZR2022QF003 and Grant ZR2021QF019; in part by the CCF-Huawei Populus Grove Fund; in part by the Fund from the Key Laboratory of Dependable Service Computing in Cyber Physical Society, China, under Grant CPSDSC202208; in part by the Taishan Scholars Program; and in part by the Qilu Young Scholar Program of Shandong University.
Publisher Copyright:
© 1982-2012 IEEE.
PY - 2023/6/1
Y1 - 2023/6/1
N2 - Energy harvesting promises to power billions of Internet of Things devices without being restricted by battery life. Because an energy harvester generally outputs weak and unstable energy, the system may suffer frequent and unpredictable power failures and fall into cyclic reboots without forward progress. Task-based intermittent computing systems, which periodically back up system state into nonvolatile memory (NVM), have been proposed to solve this nonprogress problem, at the nontrivial cost of frequent backups. Reducing the backup overhead has therefore become a major research problem for intermittent computing. This article, for the first time, proposes to parallelize state backup and program execution using asynchronous direct memory access (DMA), hiding the backup latency within the program's execution. However, naively executing the state backup and the program in parallel may produce an inconsistent system state. Specifically, the program may modify the system state while it is being backed up, so the state may be backed up incorrectly and cause the system to deliver an incorrect computation result. We analyze the system behavior in depth and observe that, although the system state may be backed up incorrectly, an incorrect backup is soon superseded by subsequent correct backups because backup operations are performed frequently. In addition, only a small subset of the variables in the program state can cause an incorrect computation result. In this article, we therefore aggressively allow incorrect backups to occur and propose a backup error detection method and a fault-tolerant backup management scheme to guarantee the correctness of the system's execution. To augment the parallel backup method, we further propose an adaptive execution method that reduces the number of backups and balances the ratio between task execution time and backup latency. We design a run-time system to implement the proposed approach, and experimental results on an STM32F7-based platform show that the proposed method achieves a 2.6× average speedup.
KW - Adaptive execution
KW - asynchronous direct memory access (DMA)
KW - intermittent computing
KW - state backup
UR - http://www.scopus.com/inward/record.url?scp=85139853714&partnerID=8YFLogxK
U2 - 10.1109/TCAD.2022.3213989
DO - 10.1109/TCAD.2022.3213989
M3 - Journal article
AN - SCOPUS:85139853714
SN - 0278-0070
VL - 42
SP - 1798
EP - 1809
JO - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
JF - IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
IS - 6
ER -