A spectral analysis approach to document summarization: Clustering and ranking sentences simultaneously

Xiaoyan Cai, Wenjie Li

Research output: Journal article publication, Journal article, Academic research, peer-review

32 Citations (Scopus)

Abstract

Automatic document summarization aims to create a compressed summary that preserves the main content of the original documents. It is well recognized that a document set often covers a number of topic themes, with each theme represented by a cluster of highly related sentences. More importantly, topic themes are not equally important: the sentences in an important theme cluster are generally deemed more salient than the sentences in a trivial theme cluster. Existing clustering-based summarization approaches integrate clustering and ranking in sequence, which unavoidably ignores the interaction between them. In this paper, we propose a novel approach based on spectral analysis that clusters and ranks sentences simultaneously. Experimental results on the DUC generic summarization datasets demonstrate the improvement of the proposed approach over other existing clustering-based approaches.
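
As a rough illustration of how a single spectral decomposition can yield both a sentence grouping and a salience score, the sketch below may help. It is not the algorithm proposed in the paper. The bag-of-words cosine similarity, the two-cluster split on the second eigenvector, the score taken from the leading eigenvector, the toy sentences, and the function names `sentence_similarity` and `spectral_cluster_and_rank` are all assumptions made for this example.

```python
import numpy as np


def sentence_similarity(sentences):
    """Cosine similarity between bag-of-words sentence vectors (zero diagonal)."""
    tokens = [s.lower().split() for s in sentences]
    vocab = {w: i for i, w in enumerate(sorted({w for t in tokens for w in t}))}
    counts = np.zeros((len(sentences), len(vocab)))
    for row, words in enumerate(tokens):
        for w in words:
            counts[row, vocab[w]] += 1.0
    norms = np.linalg.norm(counts, axis=1, keepdims=True)
    unit = counts / np.maximum(norms, 1e-12)
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)  # ignore self-similarity
    return sim


def spectral_cluster_and_rank(sim):
    """Hypothetical helper: one eigendecomposition, two outputs.

    Returns (labels, scores): a two-way cluster label per sentence and a
    salience score per sentence that sums to 1.
    """
    degree = sim.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    # Symmetrically normalized affinity matrix D^{-1/2} W D^{-1/2}.
    normalized = d_inv_sqrt[:, None] * sim * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(normalized)  # eigenvalues in ascending order
    leading, second = vecs[:, -1], vecs[:, -2]
    # Ranking: magnitude of the leading eigenvector, rescaled to sum to 1.
    scores = np.abs(leading) / np.abs(leading).sum()
    # Clustering: sign of the second eigenvector gives a two-way split.
    labels = (second > 0).astype(int)
    return labels, scores


# Toy input (assumed for illustration): two themes sharing one common word.
sentences = [
    "news cat mouse runs",
    "news cat mouse hides",
    "news stocks market falls",
    "news stocks market rises",
]
labels, scores = spectral_cluster_and_rank(sentence_similarity(sentences))
```

For a connected similarity graph, the leading eigenvector of the normalized matrix is proportional to the square root of each sentence's total similarity, so the score here is a simple degree-based salience. The toy sentences are deliberately symmetric, so their scores come out equal and only the two-theme split is informative.
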
Original language: English
Pages (from-to): 3816-3827
Number of pages: 12
Journal: Information Sciences
Volume: 181
Issue number: 18
Publication status: Published - 15 Sep 2011

Keywords

  • Document summarization
  • Sentence clustering
  • Sentence ranking
  • Spectral analysis

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Theoretical Computer Science
  • Software
  • Computer Science Applications
  • Information Systems and Management
  • Artificial Intelligence
