Multi-stage Chinese collocation extraction

Rui Feng Xu, Qin Lu

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

3 Citations (Scopus)

Abstract

A collocation is a recurrent and conventional natural language expression. In this research, Chinese collocations are categorized into four types. Based on a statistical analysis of typical collocations of each type, a multi-stage window-based collocation extraction system is designed, in which lexical statistics, synonym information, syntactic information, and dependency knowledge are used to extract n-gram collocations and the different types of bi-gram collocations separately. Experimental results show that this system achieves better precision and recall than existing statistical collocation extraction techniques.
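As an illustration of the window-based, lexical-statistics idea mentioned in the abstract, the following is a minimal Python sketch of a single statistical stage. It is not the authors' system: the abstract does not name the statistic or the window size, so pointwise mutual information (PMI), the window of 5, and the function names `window_cooccurrences` and `pmi_scores` are all assumptions made for this example. The input is assumed to be text that has already been word-segmented.

```python
import math
from collections import Counter


def window_cooccurrences(tokens, window=5):
    """Count ordered (left, right) token pairs whose positions differ by at most `window`."""
    pair_counts = Counter()
    for i, left in enumerate(tokens):
        # Look only to the right so each pair of positions is counted once.
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            pair_counts[(left, tokens[j])] += 1
    return pair_counts


def pmi_scores(tokens, window=5, min_count=2):
    """Score candidate bi-grams with PMI (an assumed statistic, not taken from the paper)."""
    word_counts = Counter(tokens)
    pair_counts = window_cooccurrences(tokens, window)
    total_words = len(tokens)
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (left, right), count in pair_counts.items():
        if count < min_count:
            continue  # drop rare pairs, for which PMI is unreliable
        p_pair = count / total_pairs
        p_left = word_counts[left] / total_words
        p_right = word_counts[right] / total_words
        scores[(left, right)] = math.log2(p_pair / (p_left * p_right))
    return scores
```

A call such as `pmi_scores("发挥 重要 作用 并 发挥 积极 作用".split(), window=2)` would score the pair `("发挥", "作用")` even though other words intervene, which is the point of using a window rather than strict adjacency. The later stages described in the abstract (synonym, syntactic, and dependency information) would filter or extend such candidates and are not sketched here.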
Original language: English
Title of host publication: 2005 International Conference on Machine Learning and Cybernetics, ICMLC 2005
Pages: 3254-3259
Number of pages: 6
Publication status: Published - 12 Dec 2005
Event: International Conference on Machine Learning and Cybernetics, ICMLC 2005 - Guangzhou, China
Duration: 18 Aug 2005 to 21 Aug 2005

Conference

Conference: International Conference on Machine Learning and Cybernetics, ICMLC 2005
Country: China
City: Guangzhou
Period: 18/08/05 to 21/08/05

Keywords

  • Collocation extraction
  • Multi-stage extraction
  • Natural language processing

ASJC Scopus subject areas

  • Engineering (all)
