Measuring relevance with named entity based patterns in topic-focused document summarization

Furu Wei, Wenjie Li, Yanxiang He

Research output: Chapter in book / Conference proceedingConference article published in proceeding or bookAcademic researchpeer-review

Abstract

In this paper, the role of named entity based patterns is emphasized in measuring the document sentences and topic relevance for topic-focused extractive summarization. Patterns are defined as the informative, semantic-sensitive text bi-grams consisting of at least one named entity or the semantic class of a named entity. They are extracted automatically according to eight pre-specified templates. Question types are also taken into consideration if they are available when dealing with topic questions. To alleviate problems with coverage, pattern and uni-gram models are integrated together to compensate each other in similarity calculation. Automatic ROUGE evaluations indicate that the proposed idea can produce a very good system that tops the best-performing system at Document Understanding Conference (DUC) 2005.
Original languageEnglish
Title of host publicationIEEE NLP-KE 2007 - Proceedings of International Conference on Natural Language Processing and Knowledge Engineering
Pages111-118
Number of pages8
DOIs
Publication statusPublished - 1 Dec 2007
EventInternational Conference on Natural Language Processing and Knowledge Engineering, IEEE NLP-KE 2007 - Beijing, China
Duration: 30 Aug 20071 Sep 2007

Conference

ConferenceInternational Conference on Natural Language Processing and Knowledge Engineering, IEEE NLP-KE 2007
CountryChina
CityBeijing
Period30/08/071/09/07

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Information Systems and Management

Cite this