Chinese terminology extraction using bilingual web resources

Yuhang Yang, Luning Ji, Qin Lu, Tiejun Zhao

Research output: Chapter in book / Conference proceedingConference article published in proceeding or bookAcademic researchpeer-review

1 Citation (Scopus)

Abstract

Automatic terminology extraction requires termhood verification for extracted terms in a specific domain. Chinese terminology extraction suffers from insufficient domain corpora for verification even though there is abundance of information in other languages. This paper presents a novel approach to overcome this problem by using word translations and bilingual web resources to improve both coverage and precision. The proposed approach incorporates bilingual information from within candidate terms themselves and from existing domain knowledge to conduct termhood calculation. In contrast to previous researches, this method is not confined to only pre-determined corpora. Preliminary experiments show a 14.8% improvement in coverage and 26.3% improvement in precision, respectively.
Original languageEnglish
Title of host publicationIEEE NLP-KE 2007 - Proceedings of International Conference on Natural Language Processing and Knowledge Engineering
Pages347-354
Number of pages8
DOIs
Publication statusPublished - 1 Dec 2007
EventInternational Conference on Natural Language Processing and Knowledge Engineering, IEEE NLP-KE 2007 - Beijing, China
Duration: 30 Aug 20071 Sep 2007

Conference

ConferenceInternational Conference on Natural Language Processing and Knowledge Engineering, IEEE NLP-KE 2007
CountryChina
CityBeijing
Period30/08/071/09/07

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Information Systems and Management

Cite this