Multi-stage Chinese collocation extraction

Rui Feng Xu, Qin Lu

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

3 Citations (Scopus)


A collocation is a recurrent, conventional natural-language expression. In this research, Chinese collocations are categorized into four types. Based on statistical analysis of typical collocations of each type, a multi-stage, window-based collocation extraction system is designed, in which lexical statistics, synonym information, syntactic information, and dependency knowledge are used to extract n-gram collocations and the different types of bi-gram collocations separately. Experimental results show that this system achieves better precision and recall than existing statistical collocation extraction techniques.
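As a rough illustration of what a window-based, lexical-statistics stage can look like, the sketch below counts co-occurrences of word pairs within a fixed window and ranks them by pointwise mutual information (PMI). This is not the authors' system: the abstract does not name the statistics used, so PMI, the window size, the `min_count` threshold, and the function names `window_cooccurrences` and `pmi_candidates` are all assumptions made for illustration. The input is assumed to be already word-segmented Chinese text, and the synonym, syntactic, and dependency stages are not modelled.

```python
import math
from collections import Counter


def window_cooccurrences(tokens, window=5):
    """Count ordered (headword, co-word) pairs within +/- `window` positions."""
    pair_counts = Counter()
    for i, head in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pair_counts[(head, tokens[j])] += 1
    return pair_counts


def pmi_candidates(sentences, window=5, min_count=2):
    """Rank bi-gram candidates by PMI (an assumed, generic association measure).

    `sentences` is an iterable of token lists (pre-segmented text).
    Pairs seen fewer than `min_count` times are dropped, since PMI is
    unreliable for rare events.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        pair_counts.update(window_cooccurrences(tokens, window))

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())

    scored = {}
    for (w1, w2), count in pair_counts.items():
        if count < min_count:
            continue
        p_pair = count / total_pairs
        p_w1 = word_counts[w1] / total_words
        p_w2 = word_counts[w2] / total_words
        scored[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))

    # Highest-scoring candidates first.
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy, hand-segmented sentences used only to exercise the code.
    corpus = [
        ["提高", "水平"],
        ["提高", "生活", "水平"],
        ["水平", "很", "高"],
    ]
    for pair, score in pmi_candidates(corpus):
        print(pair, round(score, 3))
```

A multi-stage design in the spirit of the abstract would treat this output only as a candidate list and pass it to later filters that use other knowledge sources; those later stages are outside the scope of this sketch.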
Original language: English
Title of host publication: 2005 International Conference on Machine Learning and Cybernetics, ICMLC 2005
Number of pages: 6
Publication status: Published - 12 Dec 2005
Event: International Conference on Machine Learning and Cybernetics, ICMLC 2005 - Guangzhou, China
Duration: 18 Aug 2005 to 21 Aug 2005


Conference: International Conference on Machine Learning and Cybernetics, ICMLC 2005


  • Collocation extraction
  • Multi-stage extraction
  • Natural language processing

ASJC Scopus subject areas

  • Engineering(all)

