Developing learning strategies for topic-based summarization

You Ouyang, Sujian Li, Wenjie Li

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

40 Citations (Scopus)

Abstract

Most current, well-performing topic-based summarization systems are built on the extractive framework. They score sentences from their associated features, with the feature weights either assigned manually or tuned experimentally. In this paper, we discuss how to develop learning strategies that obtain the optimal feature weights automatically, so that a sound score can be assigned to any sentence characterized by a set of features. The two fundamental issues are the training data and the learning model. To save costly manual annotation time and effort, we construct the training data by labeling each sentence with a "true" score calculated from human summaries. A Support Vector Regression (SVR) model is then used to learn how the "true" score of a sentence relates to its features. Once this relation has been modeled mathematically, SVR can predict an "estimated" score for any given sentence. Evaluations with the ROUGE-2 criterion on the DUC 2006 and DUC 2005 document sets demonstrate the competitiveness and adaptability of the proposed approaches.
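The pipeline described in the abstract (label sentences with a "true" score, fit a regression from features to that score, then predict an "estimated" score) can be illustrated with a minimal, standard-library Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the "true" score is approximated by bigram overlap with the human summaries, the two features and their values are made up, and a plain linear epsilon-insensitive regression trained by subgradient descent stands in for the authors' SVR setup.

```python
from collections import Counter


def bigrams(text):
    """Return the multiset of lowercase word bigrams in a text."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))


def true_score(sentence, human_summaries):
    """Hypothetical labeling rule (an assumption, not the paper's formula):
    the average fraction of the sentence's bigrams found in each summary."""
    sent = bigrams(sentence)
    total = sum(sent.values())
    if total == 0 or not human_summaries:
        return 0.0
    fractions = []
    for summary in human_summaries:
        ref = bigrams(summary)
        matched = sum(min(count, ref[bg]) for bg, count in sent.items())
        fractions.append(matched / total)
    return sum(fractions) / len(fractions)


def predict(w, b, x):
    """Estimated score of a sentence with feature vector x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svr(X, y, epsilon=0.05, C=10.0, lr=0.001, epochs=5000):
    """Subgradient descent on the linear SVR objective
    0.5 * ||w||^2 + C * sum(max(0, |w.x + b - y| - epsilon))."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the 0.5 * ||w||^2 regularizer
        grad_b = 0.0
        for x, target in zip(X, y):
            err = predict(w, b, x) - target
            if abs(err) > epsilon:  # only points outside the tube contribute
                sign = 1.0 if err > 0 else -1.0
                for j, xj in enumerate(x):
                    grad_w[j] += C * sign * xj
                grad_b += C * sign
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Made-up training set: each row is [topic-word overlap, position score]
# (hypothetical features); y holds the corresponding "true" scores.
X = [[0.9, 1.0], [0.7, 0.5], [0.4, 0.8], [0.2, 0.2], [0.1, 0.0]]
y = [0.80, 0.55, 0.45, 0.20, 0.10]

w, b = train_linear_svr(X, y)
estimated = [predict(w, b, x) for x in X]
```

In an extractive summarizer, the estimated scores would then be used to rank candidate sentences. A kernelized SVR from an established library would be the more usual choice in practice; the linear version is used here only to keep the sketch self-contained.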
Original language: English
Title of host publication: CIKM 2007 - Proceedings of the 16th ACM Conference on Information and Knowledge Management
Pages: 79-86
Number of pages: 8
DOIs
Publication status: Published - 1 Dec 2007
Event: 16th ACM Conference on Information and Knowledge Management, CIKM 2007 - Lisboa, Portugal
Duration: 6 Nov 2007 to 9 Nov 2007

Conference

Conference: 16th ACM Conference on Information and Knowledge Management, CIKM 2007
Country: Portugal
City: Lisboa
Period: 6/11/07 to 9/11/07

Keywords

  • Document summarization
  • Support vector regression

ASJC Scopus subject areas

  • Business, Management and Accounting (all)
  • Decision Sciences (all)
