A paralleled big data algorithm with mapreduce framework for mining twitter data

Li Bing, Chun Chung Chan

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

Abstract

Some recent studies have suggested that public opinions expressed in social media may be correlated with various social issues. Discovering what social media data can actually reveal requires data mining. Data mining approaches that can handle massive amounts of data have recently been referred to as big data algorithms. In this paper, we propose a big data algorithm for mining Twitter data. Furthermore, to ensure scalability, the MapReduce framework is adopted to parallelize the proposed algorithm. Experiments demonstrate the potential of the proposed algorithm. Computationally, the speed of execution can be shown to increase significantly despite increases in data set size. In fact, the acceleration ratio increases as the size of the data set increases and as the number of Data Nodes increases.
Original language: English
Title of host publication: Proceedings - 4th IEEE International Conference on Big Data and Cloud Computing, BDCloud 2014 with the 7th IEEE International Conference on Social Computing and Networking, SocialCom 2014 and the 4th International Conference on Sustainable Computing and Communications, SustainCom 2014
Publisher: IEEE
Pages: 121-128
Number of pages: 8
ISBN (Electronic): 9781479967193
Publication status: Published - 1 Jan 2015
Event: 4th IEEE International Conference on Big Data and Cloud Computing, BDCloud 2014 - Sydney, Australia
Duration: 3 Dec 2014 to 5 Dec 2014

Conference

Conference: 4th IEEE International Conference on Big Data and Cloud Computing, BDCloud 2014
Country/Territory: Australia
City: Sydney
Period: 3/12/14 to 5/12/14

Keywords

  • big data algorithm
  • data mining
  • MapReduce
  • social media
  • Twitter

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications
