A clustering based approach for domain relevant relation extraction

Yuhang Yang, Qin Lu, Tiejun Zhao

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

3 Citations (Scopus)

Abstract

Most existing corpus-based relation extraction techniques focus on predefined relations. This paper presents a clustering-based method for domain-relevant relation extraction that covers both relation type discovery and relation instance extraction. Given two raw corpora, one from the general domain and one from an application domain, domain-specific verbs connecting different instances are extracted using syntactic dependencies and a small set of domain concept instance seeds. Relation types are then discovered by verb clustering, followed by relation instance extraction. The proposed approach requires no predefined relation types, no prior training of domain knowledge, and no manually annotated corpora. The method is applicable to any domain corpus and is especially useful for knowledge-limited and resource-limited domains. Evaluations on relation extraction in the Chinese football domain show that the approach discovers various relations with good performance.
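
The two central steps named in the abstract, picking out domain verbs by contrasting the two corpora and then grouping verbs into relation types, can be illustrated with a minimal sketch. The abstract does not give the paper's scoring function or clustering algorithm, so everything below is an assumption made for illustration: the smoothed frequency ratio, the Jaccard similarity over argument-type pairs, the greedy single-pass grouping, the threshold value, and all function names are hypothetical and are not the authors' method.

```python
from collections import Counter


def domain_verb_scores(domain_verbs, general_verbs, smoothing=1.0):
    """Score each verb seen in the domain corpus by the ratio of its relative
    frequency there to its add-k smoothed relative frequency in the general
    corpus. Scores above 1 suggest the verb is domain specific.
    (Assumed scoring scheme, not taken from the paper.)"""
    dom, gen = Counter(domain_verbs), Counter(general_verbs)
    dom_total, gen_total = sum(dom.values()), sum(gen.values())
    vocab_size = len(set(dom) | set(gen))
    scores = {}
    for verb, count in dom.items():
        p_dom = count / dom_total
        p_gen = (gen[verb] + smoothing) / (gen_total + smoothing * vocab_size)
        scores[verb] = p_dom / p_gen
    return scores


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_verbs(verb_contexts, threshold=0.5):
    """Greedy single-pass grouping (assumed, for illustration): a verb joins
    the first cluster whose first member has context similarity >= threshold,
    otherwise it starts a new cluster. Each context is a set of
    (subject type, object type) pairs observed with the verb."""
    clusters = []
    for verb in sorted(verb_contexts):
        for cluster in clusters:
            if jaccard(verb_contexts[verb], verb_contexts[cluster[0]]) >= threshold:
                cluster.append(verb)
                break
        else:
            clusters.append([verb])
    return clusters
```

In this sketch each resulting cluster stands for one candidate relation type, and the instance pairs observed with its verbs would be the candidate relation instances. The paper's dependency parsing and seed-based instance matching are not modeled here.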
Original language: English
Title of host publication: 2008 International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2008
Publication status: Published - 1 Dec 2008
Event: 2008 International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2008 - Beijing, China
Duration: 19 Oct 2008 to 22 Oct 2008

Conference

Conference: 2008 International Conference on Natural Language Processing and Knowledge Engineering, NLP-KE 2008
Country: China
City: Beijing
Period: 19/10/08 to 22/10/08

Keywords

  • Domain verb extraction
  • Information extraction
  • Relation extraction
  • Relation type discovery
  • Verb clustering

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
