TY - JOUR
T1 - Data-Importance Aware User Scheduling for Communication-Efficient Edge Machine Learning
AU - Liu, Dongzhu
AU - Zhu, Guangxu
AU - Zhang, Jun
AU - Huang, Kaibin
N1 - Funding Information:
Manuscript received October 5, 2019; revised February 21, 2020 and April 16, 2020; accepted May 28, 2020. Date of publication June 3, 2020; date of current version March 8, 2021. The work of G. Zhu was supported by National Key R&D Program of China (No. 2018YFB1800800), Guangdong Province Key Area R&D Program (No. 2018B030338001), Leading Talents of Guangdong Province Program (No. 00201501), and Shenzhen Peacock Plan (No. KQTD2015033114415450). The work of J. Zhang was supported in part by a start-up fund of The Hong Kong Polytechnic University under Project ID P0013883. The work of K. Huang was supported by Hong Kong Research Grants Council under Grants 17208319 and 17209917, Innovation and Technology Fund under Grant GHP/016/18GD, and Guangdong Basic and Applied Basic Research Foundation under Grant 2019B1515130003. Part of this work has been presented in IEEE ICC workshop 2020. The associate editor coordinating the review of this article and approving it for publication was K. Zeng. (Corresponding author: Kaibin Huang.) Dongzhu Liu was with the Department of Electrical and Electronic Engineering, University of Hong Kong, Hong Kong. She is now with the Department of Engineering, King’s College London, London WC2R 2LS, U.K.
Publisher Copyright:
© 2015 IEEE.
PY - 2021/3
Y1 - 2021/3
N2 - With the prevalence of intelligent mobile applications, edge learning is emerging as a promising technology for fast intelligence acquisition at edge devices from distributed data generated at the network edge. One critical task of edge learning is to use the limited radio resources efficiently to acquire data samples for model training at an edge server. In this paper, we develop a novel user scheduling algorithm for data acquisition in edge learning, called (data) importance-aware scheduling. A key feature of this scheduling algorithm is that it accounts for the informativeness of data samples in addition to communication reliability. Specifically, the scheduling decision is based on a data importance indicator (DII) that combines two importance metrics, one from the communication perspective and one from the learning perspective, i.e., the signal-to-noise ratio (SNR) and data uncertainty. We first derive an explicit expression for this indicator for the classic support vector machine (SVM) classifier, where the uncertainty of a data sample is measured by its distance to the decision boundary. Then, the result is extended to convolutional neural networks (CNNs) by replacing the distance-based uncertainty measure with entropy. As demonstrated via experiments using real datasets, the proposed importance-aware scheduling can exploit two-fold multiuser diversity, namely the diversity in both the multiuser channels and the distributed data samples. This leads to faster model convergence than conventional scheduling schemes that exploit only a single type of diversity.
KW - data acquisition
KW - image classification
KW - multiuser channels
KW - resource management
KW - scheduling
UR - http://www.scopus.com/inward/record.url?scp=85096315566&partnerID=8YFLogxK
U2 - 10.1109/TCCN.2020.2999606
DO - 10.1109/TCCN.2020.2999606
M3 - Journal article
AN - SCOPUS:85096315566
SN - 2332-7731
VL - 7
SP - 265
EP - 278
JO - IEEE Transactions on Cognitive Communications and Networking
JF - IEEE Transactions on Cognitive Communications and Networking
IS - 1
M1 - 9107235
ER -