Chinese terminology extraction using window-based contextual information

Luning Ji, Mantai Sum, Qin Lu, Wenjie Li, Yirong Chen

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

12 Citations (Scopus)

Abstract

Terminology extraction is an important task for automatically updating domain-specific knowledge. Contextual information helps to decide whether newly extracted terms are terminology or not. Because extraction based on fixed patterns is of very limited use for handling natural language text, both syntactic and semantic information in the context of a term is needed to determine its termhood. In this paper, we investigate two window-based context word extraction methods that take syntactic and semantic information into account. Based on the performance of each method individually, we propose a hybrid method that combines both syntactic and semantic information. Experiments show that the hybrid method achieves a significant improvement.
Original language: English
Title of host publication: Computational Linguistics and Intelligent Text Processing - 8th International Conference, CICLing 2007, Proceedings
Pages: 62-74
Number of pages: 13
Publication status: Published - 20 Dec 2007
Event: 8th Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2007 - Mexico City, Mexico
Duration: 18 Feb 2007 to 24 Feb 2007

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4394 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 8th Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2007
Country/Territory: Mexico
City: Mexico City
Period: 18/02/07 to 24/02/07

Keywords

  • Chinese terminology
  • Termhood
  • Terminology extraction
  • Unithood
  • Window-based contextual word

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
