Abstract
Social networks, as corpora of valuable data, have attracted much attention from researchers in various fields in recent years, especially in big data analytics. However, efficient and accurate data collection, the foundation of such analysis, has received little attention in previously published work. As the volume of data on the web grows rapidly, this article identifies two major challenges that traditional distributed web crawler systems cannot meet: handling the big data in social networks quickly, and accommodating multiple web sources with a uniform collection model. To address these two challenges and thereby build a foundation for big data analytics, this article proposes an ontology-based adaptive web crawler system called OACM. The system uses the MapReduce model to balance processing resources effectively and thus speed up the collection procedure, and it defines a uniform ontology model that estimates the semantic content of both social networks and collection tasks in order to adapt to different web sources. In a set of experiments, the proposed OACM system scheduled system resources efficiently and collected large amounts of data from multiple web sources.
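
The MapReduce idea mentioned in the abstract can be illustrated with a minimal single-process sketch. It is not the OACM implementation: the function names (`map_task`, `shuffle`, `reduce_tasks`, `run`), the example URLs, and the choice of grouping crawl tasks by host are all assumptions made for illustration, and no ontology model or network access is involved.

```python
from collections import defaultdict
from urllib.parse import urlparse


def map_task(url):
    """Map step: emit (source host, url) so tasks can be grouped per web source."""
    return urlparse(url).netloc, url


def shuffle(pairs):
    """Group mapped values by key, as a MapReduce framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_tasks(host, urls):
    """Reduce step: a stand-in for fetching; it only counts the URLs per source."""
    return host, len(urls)


def run(urls):
    """Run map, shuffle, and reduce in sequence over a list of seed URLs."""
    grouped = shuffle(map_task(u) for u in urls)
    return dict(reduce_tasks(host, group) for host, group in grouped.items())


# Hypothetical seed URLs, used only to exercise the sketch.
seed_urls = [
    "https://example.com/user/1",
    "https://example.com/user/2",
    "https://example.org/post/9",
]
counts = run(seed_urls)
```

In a real deployment the map and reduce functions would run on separate workers, so that each web source's tasks can be scheduled independently; here they run in one process purely to show the data flow.
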
Original language | English |
---|---|
Title of host publication | CCIS 2014 - Proceedings of 2014 IEEE 3rd International Conference on Cloud Computing and Intelligence Systems |
Publisher | IEEE |
Pages | 530-535 |
Number of pages | 6 |
ISBN (Electronic) | 9781479947201 |
DOIs | |
Publication status | Published - 1 Jan 2014 |
Event | 3rd IEEE International Conference on Cloud Computing and Intelligence Systems, CCIS 2014 - Shenzhen, China; Duration: 27 Nov 2014 → 29 Nov 2014 |
Conference
Conference | 3rd IEEE International Conference on Cloud Computing and Intelligence Systems, CCIS 2014 |
---|---|
Country/Territory | China |
City | Shenzhen |
Period | 27/11/14 → 29/11/14 |
Keywords
- Big Data Analytics
- MapReduce
- Ontology Model
- Social Network
- Web Crawler
ASJC Scopus subject areas
- Software
- Information Systems
- Artificial Intelligence
- Computational Theory and Mathematics