Most current extractive summarization models generate summaries by selecting salient sentences. However, one of the problems with sentence-level extractive summarization is that there exists a gap between the human-written gold summary and the oracle sentence labels. In this paper, we propose to extract fact-level semantic units for better extractive summarization. We also introduce a hierarchical structure, which incorporates the multi-level of granularities of the textual information into the model. In addition, we incorporate our model with BERT using a hierarchical graph mask. This allows us to combine BERT’s ability in natural language understanding and the structural information without increasing the scale of the model. Experiments on the CNN/DaliyMail dataset show that our model achieves state-of-the-art results.
|Title of host publication||The 28th International Conference on Computational Linguistics (COLING’2020)|
|Publication status||Published - 8 Dec 2020|