Abstract
Extracting key phrases from documents is a fundamental task. The phrases in a document are generally not independent of one another in conveying its content. To capture these relationships and make better use of them in key phrase extraction, we propose using Wikipedia knowledge to model a document as a semantic network in which both n-ary and binary relationships among phrases are formulated. Based on the commonly accepted assumption that a document's title is crafted to reflect its content, and that key phrases therefore tend to be semantically close to the title, we propose a novel semi-supervised key phrase extraction approach that computes phrase importance in the semantic network, through which the influence of title phrases is propagated iteratively to the other phrases. Experimental results demonstrate the strong performance of this approach.
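The idea of propagating title influence through a phrase network can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes only binary (pairwise) relationships, uses a random-walk-with-restart update as a stand-in for the semi-supervised learning step, and the function name `propagate_title_influence`, the damping value `alpha`, and the relatedness weights are all hypothetical choices made for illustration.

```python
def propagate_title_influence(weights, title_phrases, alpha=0.85, iterations=50):
    """Spread importance from title phrases to related phrases.

    weights: dict mapping phrase -> {neighbor: relatedness}; assumed symmetric.
    title_phrases: set of phrases that occur in the title (the labeled seeds).
    """
    phrases = sorted(weights)
    # Seed vector: title phrases share the initial importance equally.
    seed = {p: (1.0 / len(title_phrases) if p in title_phrases else 0.0)
            for p in phrases}
    scores = dict(seed)
    for _ in range(iterations):
        new_scores = {}
        for p in phrases:
            incoming = 0.0
            for q, w in weights[p].items():
                total = sum(weights[q].values())
                if total > 0:
                    # q passes on its score in proportion to edge weight.
                    incoming += scores[q] * w / total
            # Mix propagated importance with a restart at the title phrases.
            new_scores[p] = alpha * incoming + (1 - alpha) * seed[p]
        scores = new_scores
    return scores


# Toy network with made-up relatedness values (not taken from Wikipedia).
weights = {
    "key phrase extraction": {"semantic network": 0.9, "weather": 0.1},
    "semantic network": {"key phrase extraction": 0.9, "wikipedia": 0.5},
    "wikipedia": {"semantic network": 0.5},
    "weather": {"key phrase extraction": 0.1},
}
scores = propagate_title_influence(weights, {"key phrase extraction"})
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy network, phrases strongly related to the title phrase end up with higher scores than weakly related ones. Modeling the n-ary relationships mentioned in the abstract would require a richer structure than this pairwise dictionary.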
Original language | English |
---|---|
Title of host publication | ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference |
Pages | 296-300 |
Number of pages | 5 |
Publication status | Published - 1 Dec 2010 |
Event | 48th Annual Meeting of the Association for Computational Linguistics, ACL 2010 - Uppsala, Sweden (11 Jul 2010 → 16 Jul 2010) |
Conference
Conference | 48th Annual Meeting of the Association for Computational Linguistics, ACL 2010 |
---|---|
Country/Territory | Sweden |
City | Uppsala |
Period | 11/07/10 → 16/07/10 |
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language