Learning similarity functions in graph-based document summarization

You Ouyang, Wenjie Li, Furu Wei, Qin Lu

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

2 Citations (Scopus)


Graph-based models have been extensively explored in document summarization in recent years. Compared with traditional feature-based models, graph-based models incorporate interrelated information into the ranking process, so they can potentially do a better job of retrieving the important content from documents. In this paper, we investigate how to measure sentence similarity, which is a crucial issue in graph-based summarization models but which, in our view, has not been well defined in the past. We propose a supervised learning approach that brings together multiple similarity measures and uses human-generated summaries to guide the combination process. It can therefore be expected to provide a more accurate estimate than a single cosine similarity measure. Experiments conducted on the DUC2005 and DUC2006 data sets show that the proposed learning approach is successful in measuring similarity. Its competitiveness and adaptability are also demonstrated.
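To make the idea in the abstract concrete, the following is a minimal sketch of graph-based sentence ranking in which edge weights come from a weighted combination of several similarity measures rather than from cosine similarity alone. It is not the authors' method: this record does not list the similarity measures or describe the learning procedure, so the two measures (`cosine_sim`, `jaccard_sim`), the fixed `weights`, and the PageRank-style iteration in `rank_sentences` are all assumptions chosen for illustration. In a supervised setting, the weights would be fit from human-generated summaries instead of being set by hand.

```python
import math
from collections import Counter


def cosine_sim(a, b):
    """Cosine similarity between two token lists, using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard_sim(a, b):
    """Jaccard overlap between the vocabularies of two token lists."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def combined_sim(a, b, weights):
    """Weighted sum of the individual measures.

    The weights are hypothetical placeholders; a learned model would
    supply them (or replace this linear combination entirely).
    """
    feats = (cosine_sim(a, b), jaccard_sim(a, b))
    return sum(w * f for w, f in zip(weights, feats))


def rank_sentences(sentences, weights=(0.5, 0.5), damping=0.85, iters=50):
    """Score sentences with a PageRank-style iteration over the similarity graph."""
    tokens = [s.lower().split() for s in sentences]
    n = len(tokens)
    # Symmetric similarity matrix with no self-loops.
    sim = [
        [combined_sim(tokens[i], tokens[j], weights) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    row_sums = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iters):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to sim[j][i].
            incoming = sum(
                scores[j] * sim[j][i] / row_sums[j]
                for j in range(n)
                if row_sums[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores
```

Calling `rank_sentences` on a list of sentence strings returns one score per sentence; a summarizer would then select the highest-scoring sentences. Swapping in a different set of measures only requires changing `combined_sim`.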
Original language: English
Title of host publication: Computer Processing of Oriental Languages
Subtitle of host publication: Language Technology for the Knowledge-based Economy - 22nd International Conference, ICCPOL 2009, Proceedings
Number of pages: 12
Publication status: Published - 9 Nov 2009
Event: 22nd International Conference on Computer Processing of Oriental Languages, ICCPOL 2009 - Hong Kong
Duration: 26 Mar 2009 to 27 Mar 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5459 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 22nd International Conference on Computer Processing of Oriental Languages, ICCPOL 2009
Country/Territory: Hong Kong


Keywords

  • Document summarization
  • Graph-based ranking
  • Sentence similarity calculation
  • Support vector machine

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
