TY - JOUR
T1 - Accurate and fast recovery of network monitoring data: A GPU accelerated matrix completion
AU - Xie, Kun
AU - Xie, Kun
AU - Xie, Kun
AU - Chen, Yuxiang
AU - Wang, Xin
AU - Xie, Gaogang
AU - Xie, Gaogang
AU - Cao, Jiannong
AU - Wen, Jigang
N1 - Funding Information:
Manuscript received October 26, 2016; revised June 21, 2017, December 11, 2017, June 27, 2018, and June 6, 2019; accepted January 23, 2020; approved by IEEE/ACM TRANSACTIONS ON NETWORKING Editor K. Argyraki. Date of publication March 19, 2020; date of current version June 18, 2020. This work was supported in part by the National Natural Science Foundation of China under Grant 61972144, Grant 61572184, Grant 61725206, and Grant 61976087, in part by the Hunan Provincial Natural Science Foundation of China under Grant 2017JJ1010, in part by the U.S. NSF under Grant ECCS 78929 and Grant CNS 1526843, in part by the Open Project Funding of State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, under Grant CARCH201809, and in part by the Peng Cheng Laboratory Project of Guangdong Province under Grant PCL2018KP004. (Corresponding author: Yuxiang Chen.) Kun Xie is with the College of Computer Science and Electronics Engineering, Hunan University, Changsha 410082, China, also with the Cyberspace Security Research Center, Peng Cheng Laboratory, Shenzhen 201315, China, and also with the Purple Mountain Laboratories, Nanjing 210000, China (e-mail: [email protected]).
Publisher Copyright:
© 1993-2012 IEEE.
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2020/6
Y1 - 2020/6
N2 - Gaining a full knowledge of end-to-end network performance is important for some advanced network management and services. Although it becomes increasingly critical, end-to-end network monitoring usually needs active probing of the path and the overhead will increase quadratically with the number of network nodes. To reduce the measurement overhead, matrix completion is proposed recently to predict the end-to-end network performance among all node pairs by only measuring a small set of paths. Despite its potential, applying matrix completion to recover the missing data suffers from low recovery accuracy and long recovery time. To address the issues, we propose MC-GPU to exploit Graphics Processing Units (GPUs) to enable parallel matrix factorization for high-speed and highly accurate Matrix Completion. To well exploit the special architecture features of GPUs for both task independent and data-independent parallel task execution, we propose several novel techniques: similar OD (origin and destination) pairs reordering taking advantage of the locality-sensitive hash (LSH) functions, balanced matrix partition, and parallel matrix completion. We implement the proposed MC-GPU on the GPU platform and evaluate the performance using real trace data. We compare the proposed MC-GPU with the state of the art matrix completion algorithms, and our results demonstrate that MC-GPU can achieve significantly faster speed with high data recovery accuracy.
AB - Gaining a full knowledge of end-to-end network performance is important for some advanced network management and services. Although it becomes increasingly critical, end-to-end network monitoring usually needs active probing of the path and the overhead will increase quadratically with the number of network nodes. To reduce the measurement overhead, matrix completion is proposed recently to predict the end-to-end network performance among all node pairs by only measuring a small set of paths. Despite its potential, applying matrix completion to recover the missing data suffers from low recovery accuracy and long recovery time. To address the issues, we propose MC-GPU to exploit Graphics Processing Units (GPUs) to enable parallel matrix factorization for high-speed and highly accurate Matrix Completion. To well exploit the special architecture features of GPUs for both task independent and data-independent parallel task execution, we propose several novel techniques: similar OD (origin and destination) pairs reordering taking advantage of the locality-sensitive hash (LSH) functions, balanced matrix partition, and parallel matrix completion. We implement the proposed MC-GPU on the GPU platform and evaluate the performance using real trace data. We compare the proposed MC-GPU with the state of the art matrix completion algorithms, and our results demonstrate that MC-GPU can achieve significantly faster speed with high data recovery accuracy.
KW - GPU
KW - locality-sensitive hash
KW - Parallel matrix completion
UR - http://www.scopus.com/inward/record.url?scp=85086903615&partnerID=8YFLogxK
U2 - 10.1109/TNET.2020.2976129
DO - 10.1109/TNET.2020.2976129
M3 - Journal article
AN - SCOPUS:85086903615
SN - 1063-6692
VL - 28
SP - 958
EP - 971
JO - IEEE/ACM Transactions on Networking
JF - IEEE/ACM Transactions on Networking
IS - 3
M1 - 9042872
ER -