TY - GEN
T1 - Faithful to the original: Fact aware neural abstractive summarization
T2 - 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
AU - Cao, Ziqiang
AU - Wei, Furu
AU - Li, Wenjie
AU - Li, Sujian
PY - 2018/1/1
Y1 - 2018/1/1
N2 - Unlike extractive summarization, abstractive summarization has to fuse different parts of the source text, which makes it prone to creating fake facts. Our preliminary study reveals that nearly 30% of the outputs from a state-of-the-art neural summarization system suffer from this problem. While previous abstractive summarization approaches usually focus on improving informativeness, we argue that faithfulness is also a vital prerequisite for a practical abstractive summarization system. To avoid generating fake facts in a summary, we leverage open information extraction and dependency parsing technologies to extract actual fact descriptions from the source text. We then propose a dual-attention sequence-to-sequence framework that conditions generation on both the source text and the extracted fact descriptions. Experiments on the Gigaword benchmark dataset demonstrate that our model can reduce fake summaries by 80%. Notably, the fact descriptions also bring a significant improvement in informativeness, since they often condense the meaning of the source text.
AB - Unlike extractive summarization, abstractive summarization has to fuse different parts of the source text, which makes it prone to creating fake facts. Our preliminary study reveals that nearly 30% of the outputs from a state-of-the-art neural summarization system suffer from this problem. While previous abstractive summarization approaches usually focus on improving informativeness, we argue that faithfulness is also a vital prerequisite for a practical abstractive summarization system. To avoid generating fake facts in a summary, we leverage open information extraction and dependency parsing technologies to extract actual fact descriptions from the source text. We then propose a dual-attention sequence-to-sequence framework that conditions generation on both the source text and the extracted fact descriptions. Experiments on the Gigaword benchmark dataset demonstrate that our model can reduce fake summaries by 80%. Notably, the fact descriptions also bring a significant improvement in informativeness, since they often condense the meaning of the source text.
UR - http://www.scopus.com/inward/record.url?scp=85055595104&partnerID=8YFLogxK
M3 - Conference article published in proceeding or book
AN - SCOPUS:85055595104
T3 - 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
SP - 4784
EP - 4791
BT - 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
PB - AAAI Press
Y2 - 2 February 2018 through 7 February 2018
ER -