Abstract
This paper emphasizes the role of named-entity-based patterns in measuring the relevance of document sentences to a topic for topic-focused extractive summarization. Patterns are defined as informative, semantically sensitive text bi-grams that contain at least one named entity or the semantic class of a named entity. They are extracted automatically according to eight pre-specified templates. When the topic is posed as questions, question types are also taken into account where available. To alleviate coverage problems, the pattern and uni-gram models are integrated so that they compensate for each other in the similarity calculation. Automatic ROUGE evaluations indicate that the proposed approach yields a system that outperforms the best-performing system at the Document Understanding Conference (DUC) 2005.
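The idea of combining a pattern model with a uni-gram model can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the eight templates, the similarity measure, or the combination scheme, so the sketch assumes cosine similarity, a plain linear interpolation, and a pre-computed set of named-entity tokens. The names `entity_bigrams`, `relevance`, and `weight` are hypothetical, and semantic classes and question types are left out.

```python
from collections import Counter
from math import sqrt


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def entity_bigrams(tokens, entities):
    """Count adjacent token pairs in which at least one token is a named entity.

    Assumption: `entities` is a set of tokens already tagged as named entities.
    """
    return Counter(
        (x, y) for x, y in zip(tokens, tokens[1:]) if x in entities or y in entities
    )


def relevance(sentence, topic, entities, weight=0.5):
    """Interpolate pattern-level and uni-gram-level similarity.

    Assumption: `weight` and the linear interpolation are illustrative choices,
    not values or formulas taken from the paper.
    """
    pattern_sim = cosine(entity_bigrams(sentence, entities), entity_bigrams(topic, entities))
    unigram_sim = cosine(Counter(sentence), Counter(topic))
    return weight * pattern_sim + (1 - weight) * unigram_sim


topic = ["hurricane", "Katrina", "damage"]
entities = {"Katrina"}
same_order = relevance(["hurricane", "Katrina", "damage"], topic, entities)
reordered = relevance(["Katrina", "hurricane", "damage"], topic, entities)
unrelated = relevance(["weather", "report"], topic, entities)
```

In this toy input, the reordered sentence shares every word with the topic but none of its entity bi-grams, so it scores lower than the sentence that preserves the word order. A sentence with no matching patterns still receives a uni-gram score, which is the coverage fallback the abstract describes.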
Original language | English |
---|---|
Title of host publication | IEEE NLP-KE 2007 - Proceedings of International Conference on Natural Language Processing and Knowledge Engineering |
Pages | 111-118 |
Number of pages | 8 |
Publication status | Published - 1 Dec 2007 |
Event | International Conference on Natural Language Processing and Knowledge Engineering, IEEE NLP-KE 2007 - Beijing, China; Duration: 30 Aug 2007 → 1 Sept 2007 |
Conference
Conference | International Conference on Natural Language Processing and Knowledge Engineering, IEEE NLP-KE 2007 |
---|---|
Country/Territory | China |
City | Beijing |
Period | 30/08/07 → 1/09/07 |
ASJC Scopus subject areas
- Computer Science Applications
- Information Systems
- Information Systems and Management