TY - GEN
T1 - Communication-Computation Efficient Device-Edge Co-Inference via AutoML
AU - Zhang, Xinjie
AU - Shao, Jiawei
AU - Mao, Yuyi
AU - Zhang, Jun
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021/12
Y1 - 2021/12
N2 - Device-edge co-inference, which partitions a deep neural network between a resource-constrained mobile device and an edge server, has recently emerged as a promising paradigm to support intelligent mobile applications. To accelerate the inference process, on-device model sparsification and intermediate feature compression are regarded as two prominent techniques. However, as the on-device model sparsity level and intermediate feature compression ratio have direct impacts on computation workload and communication overhead, respectively, and both of them affect the inference accuracy, finding the optimal values of these hyper-parameters poses a major challenge due to the large search space. In this paper, we endeavor to develop an efficient algorithm to determine these hyper-parameters. By selecting a suitable model split point and a pair of encoder/decoder for the intermediate feature vector, this problem is cast as a sequential decision problem, for which a novel automated machine learning (AutoML) framework is proposed based on deep reinforcement learning (DRL). Experimental results on an image classification task demonstrate the effectiveness of the proposed framework in achieving a better communication-computation trade-off and significant inference speedup against various baseline schemes.
AB - Device-edge co-inference, which partitions a deep neural network between a resource-constrained mobile device and an edge server, has recently emerged as a promising paradigm to support intelligent mobile applications. To accelerate the inference process, on-device model sparsification and intermediate feature compression are regarded as two prominent techniques. However, as the on-device model sparsity level and intermediate feature compression ratio have direct impacts on computation workload and communication overhead, respectively, and both of them affect the inference accuracy, finding the optimal values of these hyper-parameters poses a major challenge due to the large search space. In this paper, we endeavor to develop an efficient algorithm to determine these hyper-parameters. By selecting a suitable model split point and a pair of encoder/decoder for the intermediate feature vector, this problem is cast as a sequential decision problem, for which a novel automated machine learning (AutoML) framework is proposed based on deep reinforcement learning (DRL). Experimental results on an image classification task demonstrate the effectiveness of the proposed framework in achieving a better communication-computation trade-off and significant inference speedup against various baseline schemes.
KW - automated machine learning (AutoML)
KW - communication-computation trade-off
KW - deep neural network (DNN)
KW - deep reinforcement learning (DRL)
KW - Device-edge co-inference
UR - http://www.scopus.com/inward/record.url?scp=85127250536&partnerID=8YFLogxK
U2 - 10.1109/GLOBECOM46510.2021.9685432
DO - 10.1109/GLOBECOM46510.2021.9685432
M3 - Conference article published in proceeding or book
AN - SCOPUS:85127250536
T3 - 2021 IEEE Global Communications Conference, GLOBECOM 2021 - Proceedings
SP - 1
EP - 6
BT - 2021 IEEE Global Communications Conference, GLOBECOM 2021 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE Global Communications Conference, GLOBECOM 2021
Y2 - 7 December 2021 through 11 December 2021
ER -