Abstract
Automatic document summarization aims to create a compressed summary that preserves the main content of the original documents. It is well recognized that a document set often covers a number of topic themes, with each theme represented by a cluster of highly related sentences. More importantly, topic themes are not equally important: the sentences in an important theme cluster are generally deemed more salient than those in a trivial theme cluster. Existing clustering-based summarization approaches perform clustering and ranking in sequence, which unavoidably ignores the interaction between them. In this paper, we propose a novel approach based on spectral analysis that clusters and ranks sentences simultaneously. Experimental results on the DUC generic summarization datasets demonstrate the improvement of the proposed approach over other existing clustering-based approaches.
Original language | English |
---|---|
Pages (from-to) | 3816-3827 |
Number of pages | 12 |
Journal | Information Sciences |
Volume | 181 |
Issue number | 18 |
Publication status | Published - 15 Sept 2011 |
Keywords
- Document summarization
- Sentence clustering
- Sentence ranking
- Spectral analysis
ASJC Scopus subject areas
- Control and Systems Engineering
- Theoretical Computer Science
- Software
- Computer Science Applications
- Information Systems and Management
- Artificial Intelligence