TY - GEN
T1 - Traffic-aware task placement with guaranteed job completion time for geo-distributed big data
AU - Li, Peng
AU - Miyazaki, Toshiaki
AU - Guo, Song
PY - 2017/7/28
Y1 - 2017/7/28
N2 - Big data analysis is usually cast into parallel jobs running on geo-distributed data centers. Unlike a single data center, a geo-distributed environment poses major challenges for big data analytics because of the limited network bandwidth between data centers located in different regions. Although research efforts have been devoted to geo-distributed big data, existing solutions remain inefficient because of their suboptimal performance or high complexity. In this paper, we propose a traffic-aware task placement scheme to minimize the completion time of big data jobs. We formulate the problem as a non-convex optimization problem and design an algorithm to solve it with a proven performance gap. Finally, extensive simulations are conducted to evaluate the performance of our proposal. The simulation results show that our algorithm can reduce job completion time by 40% compared to a conventional approach that aggregates all data for centralized processing. Meanwhile, its performance gap to the optimal solution is only 10%, while its problem-solving time is extremely small.
AB - Big data analysis is usually cast into parallel jobs running on geo-distributed data centers. Unlike a single data center, a geo-distributed environment poses major challenges for big data analytics because of the limited network bandwidth between data centers located in different regions. Although research efforts have been devoted to geo-distributed big data, existing solutions remain inefficient because of their suboptimal performance or high complexity. In this paper, we propose a traffic-aware task placement scheme to minimize the completion time of big data jobs. We formulate the problem as a non-convex optimization problem and design an algorithm to solve it with a proven performance gap. Finally, extensive simulations are conducted to evaluate the performance of our proposal. The simulation results show that our algorithm can reduce job completion time by 40% compared to a conventional approach that aggregates all data for centralized processing. Meanwhile, its performance gap to the optimal solution is only 10%, while its problem-solving time is extremely small.
UR - http://www.scopus.com/inward/record.url?scp=85028359163&partnerID=8YFLogxK
U2 - 10.1109/ICC.2017.7996541
DO - 10.1109/ICC.2017.7996541
M3 - Conference article published in proceeding or book
AN - SCOPUS:85028359163
T3 - IEEE International Conference on Communications
BT - 2017 IEEE International Conference on Communications, ICC 2017
A2 - Debbah, Merouane
A2 - Gesbert, David
A2 - Mellouk, Abdelhamid
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 IEEE International Conference on Communications, ICC 2017
Y2 - 21 May 2017 through 25 May 2017
ER -