The robustness of domain lexico-taxonomy: Expanding domain lexicon with CiLin

Chu Ren Huang, Xiang Bing Li, Jia Fei Hong

Research output: Chapter in book / Conference proceeding > Conference article published in proceeding or book > Academic research > peer-review

2 Citations (Scopus)

Abstract

This paper deals with the robust expansion of Domain Lexico-Taxonomy (DLT). DLT is a domain taxonomy enriched with domain lexica. DLT was proposed as an infrastructure for crossing domain barriers (Huang et al. 2004). The DLT proposal is based on the observation that domain lexica contain entries that are also part of a general lexicon. Hence, when entries of a general lexicon are marked with their associated domain attributes, this information can have two important applications. First, the DLT will serve as seeds for domain lexica. Second, the DLT offers the most reliable evidence for deciding the domain of a new text, since these lexical clues belong to the general lexicon and occur reliably in all texts. Hence general lexicon lemmas are extracted to populate domain lexica, which are situated in a domain taxonomy. Based on this previous work, we show in this paper that the original DLT can be further expanded when a new language resource is introduced. We applied CiLin, a Chinese thesaurus, to add more than 1000 new entries to the DLT, and show through evaluation that the DLT approach is robust, since the size and number of domain lexica increased effectively.

Original language: English
Title of host publication: Proceedings of the Fourth SIGHAN Workshop on Chinese Language Processing
Publisher: Asian Federation of Natural Language Processing
Pages: 103-109
Number of pages: 7
Publication status: Published - 2005
Externally published: Yes
Event: 4th SIGHAN Workshop on Chinese Language Processing at the 2nd International Joint Conference on Natural Language Processing, SIGHAN@IJCNLP 2005 - Jeju Island, Korea, Republic of
Duration: 14 Oct 2005 to 15 Oct 2005

Conference

Conference: 4th SIGHAN Workshop on Chinese Language Processing at the 2nd International Joint Conference on Natural Language Processing, SIGHAN@IJCNLP 2005
Country/Territory: Korea, Republic of
City: Jeju Island
Period: 14/10/05 to 15/10/05

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
