Abstract
Information ordering is a nontrivial task in multi-document summarization (MDS), which typically relies on the traditional vector space model (VSM), a model notorious for its semantic deficiency. In this article, we propose a novel event-enriched VSM to alleviate the problem by building event semantics into sentence representations. The mediation of event information between sentence and term, especially in the news domain, has an intuitive appeal as well as a technical advantage in common sentence-level operations such as sentence similarity computation. Inspired by the block-style writing of humans, we base the sentence ordering algorithm on sentence clustering. To accommodate the complexity introduced by event information, we adopt a soft-to-hard clustering strategy on the event and sentence levels, using expectation-maximization clustering and K-means, respectively. For the purpose of cluster-based sentence ordering, the event-enriched VSM enables us to design an ordering algorithm that enhances event coherence, computed between sentence and sentence-context pairs. Drawing on the findings of earlier research, we also incorporate topic continuity measures and time information into the scheme. We evaluate the performance of the model and its variants both automatically and manually, with experimental results showing a clear advantage of the event-based model over baseline and non-event-based models in information ordering for multi-document news summarization. We are confident that the event-enriched VSM has even greater potential in summarization and beyond, which awaits further research.
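The idea of letting event information mediate between sentence and term can be illustrated with a minimal sketch. It is not the model from the article: the abstract does not specify how events are extracted or weighted, so the hand-assigned event labels, the `event_weight` parameter, and the function names below are all assumptions made for illustration. The sketch only shows why two sentences with no shared terms can still be similar once event dimensions are added to a term-based vector.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def term_vector(tokens):
    """Plain VSM: one dimension per term, weighted by raw count."""
    return {("term", t): float(c) for t, c in Counter(tokens).items()}


def event_enriched_vector(tokens, events, event_weight=1.0):
    """Hypothetical enrichment: term dimensions plus one dimension per event.

    The event labels and event_weight are illustrative assumptions,
    not the representation defined in the article.
    """
    vec = term_vector(tokens)
    for e, c in Counter(events).items():
        vec[("event", e)] = event_weight * c
    return vec


# Two toy sentences that report the same event with disjoint vocabulary.
s1_tokens = ["quake", "hits", "coastal", "city"]
s2_tokens = ["tremor", "shakes", "harbor", "town"]
s1_events = ["earthquake"]  # hand-assigned label (assumption)
s2_events = ["earthquake"]

plain_sim = cosine(term_vector(s1_tokens), term_vector(s2_tokens))
enriched_sim = cosine(
    event_enriched_vector(s1_tokens, s1_events),
    event_enriched_vector(s2_tokens, s2_events),
)
print(plain_sim, enriched_sim)
```

In this toy setting the plain term vectors are orthogonal, while the enriched vectors overlap on the shared event dimension, which is the kind of effect on sentence similarity that the abstract attributes to event semantics.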
Original language | English |
---|---|
Pages (from-to) | 323-351 |
Number of pages | 29 |
Journal | Computational Intelligence |
Volume | 32 |
Issue number | 2 |
Publication status | Published - 1 May 2016 |
Keywords
- coherence
- event
- MDS ordering
- two-layered clustering
- vector space model
ASJC Scopus subject areas
- Computational Mathematics
- Artificial Intelligence