Abstract
Both compression and decompression play important roles in a web service system. A high compression ratio saves storage, while fast decompression reduces the response time of the service. Focusing specifically on news web services, this paper proposes a compression mechanism that improves the efficiency of compression and decompression simultaneously by taking advantage of the semantic relations among webpages. First, webpages are clustered into news topics according to their semantic similarity. Webpages belonging to the same topic share much duplicate content, which improves the compression ratio when delta-compression is used. Second, associated news topics are detected with the help of a multiple-semantic link network of news topics. Associated topics are compressed into the same zip file, which may reduce the number of decompression operations, given how users typically read news on the Web. The authors apply the proposed compression mechanism to a practical news search engine, and the experimental results show that it achieves both a high compression ratio and fast decompression. Copyright © 2013, IGI Global.
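The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: zlib's preset dictionary is used here as a simple stand-in for a delta-compression algorithm, and the function names (`delta_compress`, `delta_decompress`, `bundle_topics`), the sample pages, and the topic names are all hypothetical.

```python
import io
import zipfile
import zlib


def delta_compress(base: bytes, page: bytes) -> bytes:
    """Compress `page` against `base`, using zlib's preset dictionary as a
    simple stand-in for a delta-compression algorithm (an assumption)."""
    compressor = zlib.compressobj(level=9, zdict=base)
    return compressor.compress(page) + compressor.flush()


def delta_decompress(base: bytes, blob: bytes) -> bytes:
    """Restore a page; the same `base` must be supplied as the dictionary."""
    decompressor = zlib.decompressobj(zdict=base)
    return decompressor.decompress(blob) + decompressor.flush()


def bundle_topics(topics: dict) -> bytes:
    """Write the pages of several associated topics into one in-memory zip,
    so that a single archive serves a reader who moves between those topics."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for topic, pages in topics.items():
            for name, content in pages.items():
                archive.writestr(f"{topic}/{name}", content)
    return buffer.getvalue()


# Hypothetical pages from one news topic: mostly shared text, small updates.
shared = (
    b"<html><body><div class='nav'>Home | World | Sports | Tech</div>"
    b"<p>The summit opened on Monday with delegates from forty countries.</p>"
    b"<p>Organisers said the agenda covers trade, energy and security.</p>"
    b"<p>Observers expect negotiations to run through the week.</p>"
)
page_a = shared + b"<p>Update: talks continue.</p></body></html>"
page_b = shared + b"<p>Update: agreement reached on Tuesday.</p></body></html>"

standalone = zlib.compress(page_b, 9)   # page compressed on its own
delta = delta_compress(page_a, page_b)  # page compressed against a sibling

# Hypothetical associated topics stored together in one archive.
archive_bytes = bundle_topics({
    "summit": {"a.html": page_a, "b.html": page_b},
    "trade-deal": {"c.html": page_b},
})
```

Because `page_b` shares most of its bytes with `page_a`, the dictionary-based output is smaller than the standalone one, which mirrors the abstract's argument for clustering pages by topic before compressing them. A real system would need a policy for choosing the base page of each topic, which this sketch does not address.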
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 49-64 |
Number of pages | 16 |
Journal | International Journal of Cognitive Informatics and Natural Intelligence |
Volume | 7 |
Issue number | 2 |
Publication status | Published - 1 Jan 2013 |
Externally published | Yes |
Keywords
- Compression efficiency
- Decompression speed
- Delta-compression algorithm
- Multiple-semantic link network
- News topics clustering
- Webpages compression
ASJC Scopus subject areas
- Software
- Human-Computer Interaction
- Artificial Intelligence