Generating a Structured Summary of Numerous Academic Papers: Dataset and Method

Shuaiqi Liu, Jiannong Cao, Ruosong Yang, Zhiyuan Wen

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

11 Citations (Scopus)

Abstract

Writing a survey paper on one research topic usually needs to cover the salient content from numerous related papers, which can be modeled as a multi-document summarization (MDS) task. Existing MDS datasets usually focus on producing the structureless summary covering a few input documents. Meanwhile, previous structured summary generation works focus on summarizing a single document into a multi-section summary. These existing datasets and methods cannot meet the requirements of summarizing numerous academic papers into a structured summary. To deal with the scarcity of available data, we propose BigSurvey, the first large-scale dataset for generating comprehensive summaries of numerous academic papers on each topic. We collect target summaries from more than seven thousand survey papers and utilize their 430 thousand reference papers' abstracts as input documents. To organize the diverse content from dozens of input documents and ensure the efficiency of processing long text sequences, we propose a summarization method named category-based alignment and sparse transformer (CAST). The experimental results show that our CAST method outperforms various advanced summarization methods.
Original language: English
Title of host publication: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence
Pages: 4259-4265
Number of pages: 7
Publication status: Published - Feb 2023
Event: THE 31ST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE - Messe Wien, Vienna, Austria
Duration: 23 Jul 2022 to 29 Jul 2022
https://ijcai-22.org/

Competition

Competition: THE 31ST INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE
Abbreviated title: IJCAI-ECAI 2022
Country/Territory: Austria
City: Vienna
Period: 23/07/22 to 29/07/22