TY - JOUR
T1 - Resource-Constrained Edge AI with Early Exit Prediction
AU - Dong, Rongkang
AU - Mao, Yuyi
AU - Zhang, Jun
N1 - Funding Information:
Manuscript received Apr. 20, 2022; revised May 15, 2022; accepted Jun. 10, 2022. This work was supported in part by a start-up fund of the Hong Kong Polytechnic University under Grant P0038174. The associate editor coordinating the review of this paper and approving it for publication was L. Q. Fu.
Publisher Copyright:
© 2022, Posts and Telecom Press Co Ltd. All rights reserved.
PY - 2022/6
Y1 - 2022/6
N2 - By leveraging data sample diversity, the early-exit network has recently emerged as a prominent neural network architecture to accelerate the deep learning inference process. However, the intermediate classifiers of the early exits introduce additional computation overhead, which is unfavorable for resource-constrained edge artificial intelligence (AI). In this paper, we propose an early exit prediction mechanism to reduce the on-device computation overhead in a device-edge co-inference system supported by early-exit networks. Specifically, we design a low-complexity module, namely the exit predictor, to guide some distinctly “hard” samples to bypass the computation of the early exits. Moreover, considering the varying communication bandwidth, we extend the early exit prediction mechanism for latency-aware edge inference, which adapts the prediction thresholds of the exit predictor and the confidence thresholds of the early-exit network via a few simple regression models. Extensive experimental results demonstrate the effectiveness of the exit predictor in achieving a better tradeoff between accuracy and on-device computation overhead for early-exit networks. In addition, compared with the baseline methods, the proposed method for latency-aware edge inference attains higher inference accuracy under different bandwidth conditions.
AB - By leveraging data sample diversity, the early-exit network has recently emerged as a prominent neural network architecture to accelerate the deep learning inference process. However, the intermediate classifiers of the early exits introduce additional computation overhead, which is unfavorable for resource-constrained edge artificial intelligence (AI). In this paper, we propose an early exit prediction mechanism to reduce the on-device computation overhead in a device-edge co-inference system supported by early-exit networks. Specifically, we design a low-complexity module, namely the exit predictor, to guide some distinctly “hard” samples to bypass the computation of the early exits. Moreover, considering the varying communication bandwidth, we extend the early exit prediction mechanism for latency-aware edge inference, which adapts the prediction thresholds of the exit predictor and the confidence thresholds of the early-exit network via a few simple regression models. Extensive experimental results demonstrate the effectiveness of the exit predictor in achieving a better tradeoff between accuracy and on-device computation overhead for early-exit networks. In addition, compared with the baseline methods, the proposed method for latency-aware edge inference attains higher inference accuracy under different bandwidth conditions.
KW - artificial intelligence (AI)
KW - device-edge cooperative inference
KW - early exit prediction
KW - early-exit network
KW - edge AI
UR - http://www.scopus.com/inward/record.url?scp=85134069844&partnerID=8YFLogxK
U2 - 10.23919/JCIN.2022.9815196
DO - 10.23919/JCIN.2022.9815196
M3 - Journal article
AN - SCOPUS:85134069844
SN - 2096-1081
VL - 7
SP - 122
EP - 134
JO - Journal of Communications and Information Networks
JF - Journal of Communications and Information Networks
IS - 2
ER -
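
For orientation only: the sketch below is not taken from the paper above; it is a minimal PyTorch illustration of the general mechanism its abstract describes, i.e. an early-exit network whose intermediate classifier is gated by a low-complexity exit predictor, so that samples judged "hard" bypass the early exit. All layer shapes, module names (exit_predictor, early_head, final_head), and the pred_threshold / conf_threshold values are hypothetical.

# Illustrative sketch (not the authors' code): early-exit inference with a
# lightweight exit predictor that lets likely-"hard" samples skip the
# early-exit classifier. Shapes, names, and thresholds are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyExitNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.exit_predictor = nn.Linear(16, 1)        # low-complexity gate
        self.early_head = nn.Linear(16, num_classes)  # intermediate classifier
        self.backbone = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.final_head = nn.Linear(32, num_classes)  # final classifier

    def forward(self, x, pred_threshold=0.5, conf_threshold=0.8):
        # Per-sample inference (batch size 1 assumed for the .item() calls).
        h = self.stem(x)
        feat = h.mean(dim=(2, 3))  # global average pooling -> (1, 16)
        # Exit predictor: if the sample looks "hard", bypass the early-exit
        # classifier entirely and forward features to the remaining layers
        # (e.g., offloaded to the edge server).
        p_easy = torch.sigmoid(self.exit_predictor(feat)).item()
        if p_easy >= pred_threshold:
            probs = F.softmax(self.early_head(feat), dim=-1)
            conf, label = probs.max(dim=-1)
            if conf.item() >= conf_threshold:
                return label.item(), "early exit (on device)"
        # "Hard" or low-confidence sample: continue to the final exit.
        g = self.backbone(h).mean(dim=(2, 3))
        return self.final_head(g).argmax(dim=-1).item(), "final exit (edge server)"

net = EarlyExitNet().eval()
with torch.no_grad():
    print(net(torch.randn(1, 3, 32, 32)))

The two gating parameters are exposed as arguments because, per the abstract, the paper's latency-aware extension adapts exactly these quantities (the exit predictor's prediction thresholds and the early exits' confidence thresholds) to the available bandwidth via simple regression models; how those regressions are fitted is described in the paper itself, not here.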