Cluster frameworks for efficient scheduling and resource allocation in data center networks: A survey

Kun Wang, Qihua Zhou, Song Guo, Jiangtao Luo

Research output: Journal article publication › Review article › Academic research › peer-review

20 Citations (Scopus)

Abstract

Data centers are widely used for big data analytics, which often involves data-parallel jobs such as queries and web services. Meanwhile, cluster frameworks are being rapidly developed for data-intensive applications in data center networks (DCNs). To raise the performance of these frameworks, much effort has been devoted to improving scheduling strategies and resource allocation algorithms. With the deployment of geo-distributed data centers and data-intensive applications, optimization in DCNs has regained widespread attention in both industry and academia. Many solutions, such as coflow-aware scheduling and speculative execution, have been proposed to meet various requirements. Therefore, we present a solid starting point and a comprehensive overview of this area to help readers quickly understand state-of-the-art technologies and research progress. We observe that algorithms in cluster frameworks are implemented under different guidelines and can be classified according to scheduling granularity, controller management, and prior-knowledge requirement. In addition, mechanisms for addressing crucial challenges in DCNs are discussed, including providing low latency and minimizing job completion time. Moreover, we analyze the desirable properties of fault tolerance and scalability to illuminate the design principles of distributed systems. We hope that this paper will shed light on this promising area and serve as a guide for further research.

Original language: English
Article number: 8416689
Pages (from-to): 3560-3580
Number of pages: 21
Journal: IEEE Communications Surveys and Tutorials
Volume: 20
Issue number: 4
Publication status: Published - 1 Oct 2018

Keywords

  • Big data
  • Cluster frameworks
  • Coflow
  • Data center networks
  • Data-parallel jobs
  • Distributed systems
  • Resource allocation
  • Scheduling

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
