Abstract
This paper emphasizes the role of named-entity-based patterns in measuring the relevance between document sentences and the topic for topic-focused extractive summarization. Patterns are defined as informative, semantic-sensitive text bi-grams that contain at least one named entity or the semantic class of a named entity. They are extracted automatically according to eight pre-specified templates. When handling topic questions, question types are also taken into account if they are available. To alleviate coverage problems, the pattern and uni-gram models are integrated so that they complement each other in the similarity calculation. Automatic ROUGE evaluations indicate that the proposed approach yields a system that outperforms the best-performing system at the Document Understanding Conference (DUC) 2005.
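The idea of combining pattern and uni-gram evidence can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names (`bigram_patterns`, `jaccard`, `relevance`), the hand-supplied entity dictionary standing in for a named entity recognizer, the use of Jaccard overlap as the similarity measure, and the linear interpolation `weight`. The paper's eight extraction templates are not given in the abstract, so the sketch simply keeps any adjacent token pair that contains a named entity.

```python
from typing import Dict, List, Set, Tuple


def bigram_patterns(tokens: List[str], entities: Dict[str, str]) -> Set[Tuple[str, str]]:
    """Return adjacent token pairs that contain at least one named entity.

    `entities` maps an entity token to its semantic class (hypothetical
    stand-in for a real recognizer). Entities are replaced by their class.
    """
    patterns = set()
    for left, right in zip(tokens, tokens[1:]):
        if left in entities or right in entities:
            patterns.add((entities.get(left, left), entities.get(right, right)))
    return patterns


def jaccard(a: set, b: set) -> float:
    """Jaccard overlap of two sets; 0.0 when either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def relevance(sentence: List[str], topic: List[str],
              entities: Dict[str, str], weight: float = 0.5) -> float:
    """Interpolate pattern overlap and uni-gram overlap (assumed scheme)."""
    pattern_score = jaccard(bigram_patterns(sentence, entities),
                            bigram_patterns(topic, entities))
    unigram_score = jaccard(set(sentence), set(topic))
    return weight * pattern_score + (1 - weight) * unigram_score


entities = {"Beijing": "LOCATION", "Shanghai": "LOCATION"}
topic = ["floods", "in", "Beijing"]
sentence = ["floods", "in", "Shanghai"]
score = relevance(sentence, topic, entities)
```

In this toy case the two texts share the pattern `("in", "LOCATION")` even though the city names differ, so the pattern score is high while the uni-gram score only reflects the shared words. The uni-gram term in turn keeps the score from dropping to zero for sentences that contain no named entity, which is one plausible reading of how the two models compensate for each other's coverage gaps.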
| Original language | English |
|---|---|
| Title of host publication | IEEE NLP-KE 2007 - Proceedings of International Conference on Natural Language Processing and Knowledge Engineering |
| Pages | 111-118 |
| Number of pages | 8 |
| Publication status | Published - 1 Dec 2007 |
| Event | International Conference on Natural Language Processing and Knowledge Engineering, IEEE NLP-KE 2007 - Beijing, China. Duration: 30 Aug 2007 → 1 Sept 2007 |
Conference
| Conference | International Conference on Natural Language Processing and Knowledge Engineering, IEEE NLP-KE 2007 |
|---|---|
| Country/Territory | China |
| City | Beijing |
| Period | 30/08/07 → 1/09/07 |
ASJC Scopus subject areas
- Computer Science Applications
- Information Systems
- Information Systems and Management