Improving the compression efficiency for news web service using semantic relations among webpages

X. Wei, X. Luo, Qing Li

Research output: Journal article publication, Journal article, Academic research, peer-review

2 Citations (Scopus)

Abstract

Both compression and decompression play important roles in a web service system. A high compression ratio saves storage, while fast decompression reduces the service's response time. Focusing specifically on news web services, this paper proposes a compression mechanism that improves the efficiency of compression and decompression simultaneously by exploiting the semantic relations among webpages. First, webpages are clustered into news topics according to their semantic similarity. Webpages belonging to the same topic share much duplicate content, which improves the compression ratio under delta-compression. Second, associated news topics are detected with the help of a multiple-semantic link network of news topics. Associated topics are compressed into the same zip file, which may reduce the number of decompressions required, given how users typically read news on the Web. The authors apply the proposed compression mechanism to a practical news search engine, and the experimental results show that it achieves both a high compression ratio and fast decompression speed. Copyright © 2013, IGI Global.
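
The first step of the abstract (pages within one topic share content, so compressing one page relative to another pays off) can be illustrated with a minimal sketch. This is not the authors' delta-compression algorithm: it swaps in DEFLATE with a preset dictionary from Python's standard `zlib` module as a stand-in, and the function names and demo pages below are hypothetical. It also assumes pages smaller than DEFLATE's 32 KB window.

```python
import random
import zlib
from typing import Tuple


def compress_against(base: bytes, page: bytes) -> bytes:
    """Compress `page` using `base` as a preset DEFLATE dictionary.

    Stand-in for delta-compression: content that `page` shares with
    `base` is encoded as short back-references into the dictionary.
    """
    comp = zlib.compressobj(level=9, zdict=base)
    return comp.compress(page) + comp.flush()


def decompress_against(base: bytes, blob: bytes) -> bytes:
    """Recover the page; the same `base` must be supplied again."""
    decomp = zlib.decompressobj(zdict=base)
    return decomp.decompress(blob) + decomp.flush()


def make_demo_pages() -> Tuple[bytes, bytes]:
    """Build two hypothetical pages on one topic with mostly shared text."""
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    letters = "abcdefghijklmnopqrstuvwxyz"
    words = ["".join(rng.choice(letters) for _ in range(6)) for _ in range(1500)]
    shared = " ".join(words)
    page_a = (shared + " first report on the story").encode()
    page_b = (shared + " follow-up with new details").encode()
    return page_a, page_b


if __name__ == "__main__":
    page_a, page_b = make_demo_pages()
    standalone = zlib.compress(page_b, 9)
    delta = compress_against(page_a, page_b)
    assert decompress_against(page_a, delta) == page_b
    print("standalone bytes:", len(standalone))
    print("relative to page_a:", len(delta))
```

The trade-off in this sketch is that decompressing a page requires its base page to be available, which is one reason grouping related pages together in storage matters.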
Original language: English
Pages (from-to): 49-64
Number of pages: 16
Journal: International Journal of Cognitive Informatics and Natural Intelligence
Volume: 7
Issue number: 2
Publication status: Published - 1 Jan 2013
Externally published: Yes

Keywords

  • Compression efficiency
  • Decompression speed
  • Delta-compression algorithm
  • Multiple-semantic link network
  • News topics clustering
  • Webpages compression

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Artificial Intelligence
