TY - GEN
T1 - Online Internet traffic measurement and monitoring using Spark Streaming
AU - Zhou, Baojun
AU - Li, Jie
AU - Guo, Song
AU - Wu, Jinsong
AU - Hu, Yongqiang
AU - Zhu, Lihua
PY - 2018/1/10
Y1 - 2018/1/10
AB - Due to the explosive growth of Internet traffic, network operators must be able to monitor the state of the whole network and manage their network resources efficiently. Traditional network analysis methods that run on a single machine are no longer suitable for such large volumes of traffic data because of their limited processing capacity. Some big data frameworks, such as Hadoop and Spark, can handle such analysis jobs even for large network traffic, but they are inherently designed for offline data analysis. In this paper, we treat online network analysis as a stream analysis problem and use Spark Streaming to cope with high-speed Internet traffic data in real time. The system consists of two parts: collectors and a stream processor. First, several collectors capture network traffic data from switches through mirrored ports and send the packet information to a central stream processor, which is a cluster running Spark Streaming. Then, the stream processor analyzes the input data streams and calculates Internet performance metrics. We take TCP performance monitoring as an example to show how network measurement can be done using the stream processing platform. Finally, we conducted experiments on a cluster of 3 computers in standalone mode, showing that our system performs well in measuring and monitoring large volumes of Internet traffic.
UR - http://www.scopus.com/inward/record.url?scp=85046396574&partnerID=8YFLogxK
U2 - 10.1109/GLOCOM.2017.8255000
DO - 10.1109/GLOCOM.2017.8255000
M3 - Conference article published in proceeding or book
T3 - 2017 IEEE Global Communications Conference, GLOBECOM 2017 - Proceedings
SP - 1
EP - 6
BT - 2017 IEEE Global Communications Conference, GLOBECOM 2017 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 IEEE Global Communications Conference, GLOBECOM 2017
Y2 - 4 December 2017 through 8 December 2017
ER -