Simultaneous Clustering and Noise Detection for Theme-based Summarization

Xiaoyan Cai, Renxian Zhang, Dehong Gao, Wenjie Li

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

1 Citation (Scopus)

Abstract

Multi-document summarization aims to produce a concise summary that contains the salient information from a set of source documents. Since documents often cover a number of topical themes, with each theme represented by a cluster of highly related sentences, sentence clustering plays a pivotal role in theme-based summarization. Moreover, noting that real-world datasets always contain noise, which inevitably degrades clustering performance, we combine noise detection with spectral clustering to generate ordinary sentence clusters and one noise sentence cluster. We are also interested in biasing the theme-based summaries towards a user's query. The effectiveness of the proposed approaches is demonstrated by both cluster quality analysis and summarization evaluation conducted on the DUC generic and query-oriented summarization datasets.

Original language: English
Title of host publication: IJCNLP 2011 - Proceedings of the 5th International Joint Conference on Natural Language Processing
Editors: Haifeng Wang, David Yarowsky
Publisher: Association for Computational Linguistics (ACL)
Pages: 491-499
Number of pages: 9
ISBN (Electronic): 9789744665645
Publication status: Published - 2011
Event: 5th International Joint Conference on Natural Language Processing, IJCNLP 2011 - Chiang Mai, Thailand
Duration: 8 Nov 2011 to 13 Nov 2011

Publication series

Name: IJCNLP 2011 - Proceedings of the 5th International Joint Conference on Natural Language Processing

Conference

Conference: 5th International Joint Conference on Natural Language Processing, IJCNLP 2011
Country/Territory: Thailand
City: Chiang Mai
Period: 8/11/11 to 13/11/11

ASJC Scopus subject areas

  • Language and Linguistics
  • Artificial Intelligence
  • Software
  • Linguistics and Language
