Query-oriented unsupervised multi-document summarization via deep learning model

Sheng Hua Zhong, Yan Liu, Bin Li, Jing Long

Research output: Journal article publication, Journal article, Academic research, peer-review

71 Citations (Scopus)

Abstract

Capturing the compositional process from words to documents is a key challenge in natural language processing and information retrieval. Extractive query-oriented multi-document summarization generates a summary by extracting a suitable set of sentences from multiple documents based on a pre-given query. This paper proposes a novel document summarization framework based on a deep learning model, which has shown outstanding extraction ability in many real-world applications. The framework consists of three parts: concept extraction, summary generation, and reconstruction validation. A new query-oriented extraction technique is proposed to extract information distributed across multiple documents. The whole deep architecture is then fine-tuned by minimizing the information loss in reconstruction validation. Based on the concepts extracted layer by layer from the deep architecture, dynamic programming is used to find the most informative set of sentences for the summary. Experiments on three benchmark datasets (DUC 2005, 2006, and 2007) assess and confirm the effectiveness of the proposed framework and algorithms. Experimental results show that the proposed method outperforms state-of-the-art extractive summarization approaches. Moreover, we provide a statistical analysis of query words based on Amazon's Mechanical Turk (MTurk) crowdsourcing platform. There exist underlying relationships between topic words and the content that can contribute to the summarization task.
Original language: English
Article number: 10053
Pages (from-to): 8146-8155
Number of pages: 10
Journal: Expert Systems with Applications
Volume: 42
Issue number: 21
Publication status: Published - 18 Jul 2015

Keywords

  • Deep learning
  • Multi-document
  • Neocortex simulation
  • Query-oriented summarization

ASJC Scopus subject areas

  • General Engineering
  • Computer Science Applications
  • Artificial Intelligence
