Abstractive and extractive methods are the two main approaches to automatic document summarization. In this paper, to exploit the relatedness and the respective advantages of both approaches, we propose a general unified framework for abstractive summarization that incorporates extractive summarization as an auxiliary task. Specifically, our framework is composed of a shared hierarchical document encoder, a decoder based on a hierarchical attention mechanism, and an extractor. We adopt a multi-task learning method to train the two tasks jointly, which enables the shared encoder to better capture the semantics of the document. Moreover, because our main task is abstractive summarization, we constrain the attention learned in the abstractive task with the labels of the extractive task to strengthen the consistency between the two tasks. Experiments on the CNN/DailyMail dataset demonstrate that both the auxiliary task and the attention constraint contribute significantly to improving performance, and that our model is comparable to state-of-the-art abstractive models. In addition, we halve the number of labels for the extractive task, pretrain the extractor, and jointly train the two tasks using the sentence salience estimated by the extractive task to constrain the attention of the abstractive task. The results decrease only slightly compared with using fully labeled data for the auxiliary task.
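The joint objective described above can be illustrated with a minimal sketch. The abstract does not give the loss formulas, so everything below is an assumption made for illustration: the function name `joint_loss`, the weights `ext_weight` and `att_weight`, and in particular the form of the attention constraint, which here penalizes sentence-level attention mass that falls outside the sentences labeled as extractive.

```python
import math


def joint_loss(token_probs, salience, labels, sent_attention,
               ext_weight=1.0, att_weight=1.0):
    """Hypothetical multi-task loss; not the formulation from the paper.

    token_probs:    probability the decoder assigns to each reference token
    salience:       extractor's predicted salience per sentence, in (0, 1)
    labels:         extractive labels per sentence (1 = in extractive summary)
    sent_attention: per decoding step, attention weights over sentences
    """
    eps = 1e-12  # avoids log(0)

    # Abstractive task: mean negative log-likelihood of the reference tokens.
    abs_loss = -sum(math.log(p + eps) for p in token_probs) / len(token_probs)

    # Extractive task: mean binary cross-entropy against the sentence labels.
    ext_loss = -sum(
        y * math.log(s + eps) + (1 - y) * math.log(1 - s + eps)
        for s, y in zip(salience, labels)
    ) / len(labels)

    # Attention constraint (assumed form): average the sentence-level
    # attention over decoding steps, then penalize low mass on labeled sentences.
    steps = len(sent_attention)
    mean_att = [sum(step[i] for step in sent_attention) / steps
                for i in range(len(labels))]
    mass_on_labeled = sum(a for a, y in zip(mean_att, labels) if y == 1)
    att_loss = -math.log(mass_on_labeled + eps)

    total = abs_loss + ext_weight * ext_loss + att_weight * att_loss
    return total, (abs_loss, ext_loss, att_loss)
```

Under these assumptions, decoder attention that concentrates on labeled sentences yields a smaller total loss than attention that concentrates on unlabeled ones, which is the kind of consistency between the two tasks that the constraint is meant to encourage.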
- Attention mechanism
- Automatic document summarization
- Multi-task learning
ASJC Scopus subject areas
- Computational Mechanics
- Computer Science Applications