Sense prediction study: two corpus-driven linguistic approaches

J.F. Hong, S.J. Ker, K. Ahrens, Chu-ren Huang

Research output: Journal article publication, Journal article, Academic research, peer-review

Abstract

In this study, we propose two corpus-driven linguistic approaches to sense prediction. We concentrate on the character similarity clustering approach and the concept similarity clustering approach, which predict the senses of non-assigned words using corpora and tools such as the Chinese Gigaword Corpus and HowNet. We evaluate the sense predictions against the sense divisions of Chinese Wordnet (CWN) and Xiandai Hanyu Cidian (Xian Han). Using these corpora, we determine the clusters of our four target words (chi1 "eat", wan2 "play", huan4 "change", and shao1 "burn") in order to predict all of their possible senses and then evaluate the predictions. This process demonstrates the viability of the corpus-based approaches.
Original language: English
Pages (from-to): 229-241
Number of pages: 13
Journal: International Journal of Computer Processing of Languages
Volume: 23
Issue number: 3
Publication status: Published - 2011

Keywords

  • Lexical ambiguity
  • Sense prediction
  • Corpus-based approach
  • Character similarity clustering approach
  • Concept similarity clustering approach
  • Evaluation
