TY - JOUR
T1 - LORM: Learning to Optimize for Resource Management in Wireless Networks with Few Training Samples
AU - Shen, Yifei
AU - Shi, Yuanming
AU - Zhang, Jun
AU - Letaief, Khaled B.
N1 - Funding Information:
Manuscript received May 4, 2019; revised August 3, 2019; accepted October 3, 2019. Date of publication October 22, 2019; date of current version January 8, 2020. This work was supported in part by the General Research Fund from the Research Grants Council of Hong Kong under Project 16210719, and in part by the National Natural Science Foundation of China under Grant 61601290. This article was presented in part at the IEEE Global Conference on Signal and Information Processing (GlobalSIP), 2018 [1] and the IEEE International Conference on Communications (ICC), 2019 [2]. The associate editor coordinating the review of this article and approving it for publication was X. Cheng. (Corresponding author: Jun Zhang.) Y. Shen and K. B. Letaief are with the Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong (e-mail: [email protected]; [email protected]).
Publisher Copyright:
© 2002-2012 IEEE.
PY - 2020/1
Y1 - 2020/1
AB - Effective resource management plays a pivotal role in wireless networks, which, unfortunately, typically results in challenging mixed-integer nonlinear programming (MINLP) problems. Machine learning-based methods have recently emerged as a disruptive way to obtain near-optimal performance for MINLPs with affordable computational complexity. There have been some attempts to apply such methods to resource management in wireless networks, but these attempts require huge amounts of training samples and lack the capability to handle constrained problems. Furthermore, they suffer from severe performance deterioration when the network parameters change, which commonly happens and is referred to as the task mismatch problem. In this paper, to reduce the sample complexity and address the feasibility issue, we propose a framework of Learning to Optimize for Resource Management (LORM). In contrast to the end-to-end learning approach adopted in previous studies, LORM learns the optimal pruning policy in the branch-and-bound algorithm for MINLPs via a sample-efficient method, namely, imitation learning. To further address the task mismatch problem, we develop a transfer learning method via self-imitation in LORM, named LORM-TL, which can quickly adapt a pre-trained machine learning model to the new task with only a few additional unlabeled training samples. Numerical simulations demonstrate that LORM outperforms specialized state-of-the-art algorithms and achieves near-optimal performance, while providing significant speedup compared with the branch-and-bound algorithm. Moreover, LORM-TL, by relying on a few unlabeled samples, achieves performance comparable to that of a model trained from scratch with sufficient labeled samples.
KW - few-shot learning
KW - mixed-integer nonlinear programming
KW - resource allocation
KW - transfer learning
KW - wireless communications
UR - http://www.scopus.com/inward/record.url?scp=85078331151&partnerID=8YFLogxK
U2 - 10.1109/TWC.2019.2947591
DO - 10.1109/TWC.2019.2947591
M3 - Journal article
AN - SCOPUS:85078331151
SN - 1536-1276
VL - 19
SP - 665
EP - 679
JO - IEEE Transactions on Wireless Communications
JF - IEEE Transactions on Wireless Communications
IS - 1
M1 - 8879693
ER -