A multi-stage Chinese collocation extraction system

Ruifeng Xu, Qin Lu

Research output: Chapter in book / Conference proceeding; Conference article published in proceeding or book; Academic research; peer-review

7 Citations (Scopus)


Most existing collocation extraction systems rely on globally significant statistical behaviors and have no mechanism for handling different types of collocations. In this work, collocations are categorized into four types by taking compositionality, substitutability, modifiability and internal associations into consideration. Based on an analysis of each type, a multi-stage extraction system is designed that uses different combinations of discriminative features to identify different types of collocations in different stages. Perceptron training is employed to optimize the consolidation of discriminative features from different sources. Experimental results show that the achieved performance is much better than that of most reported work.
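The abstract's idea of using perceptron training to consolidate discriminative features can be illustrated with a minimal sketch. This is not the authors' implementation: the three feature names, the toy scores, the labels, and the function names `train_perceptron` and `is_collocation` are all hypothetical, chosen only to show how a perceptron might learn weights for combining several per-candidate scores into one accept/reject decision.

```python
from typing import List, Sequence, Tuple

def train_perceptron(samples: Sequence[Sequence[float]],
                     labels: Sequence[int],
                     epochs: int = 100,
                     lr: float = 1.0) -> Tuple[List[float], float]:
    """Classic perceptron rule: adjust weights only on misclassified samples."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else 0
            if predicted != y:
                update = lr * (y - predicted)  # +lr or -lr
                weights = [w + update * xi for w, xi in zip(weights, x)]
                bias += update
                errors += 1
        if errors == 0:  # every training sample is classified correctly
            break
    return weights, bias

def is_collocation(x: Sequence[float], weights: Sequence[float], bias: float) -> bool:
    """Accept a candidate when the weighted sum of its feature scores is positive."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias > 0

# Hypothetical feature vectors per candidate word pair (assumed, not from the paper):
# [association strength, resistance to substitution, resistance to modification]
candidates = [
    [0.9, 0.8, 0.7],  # labelled as a collocation
    [0.8, 0.9, 0.6],  # labelled as a collocation
    [0.2, 0.1, 0.3],  # labelled as a free combination
    [0.3, 0.2, 0.1],  # labelled as a free combination
]
gold_labels = [1, 1, 0, 0]

weights, bias = train_perceptron(candidates, gold_labels)
predictions = [is_collocation(x, weights, bias) for x in candidates]
```

In a multi-stage design of the kind the abstract describes, one could imagine training a separate weight vector per stage over a different feature subset; that arrangement is likewise an assumption here, not a detail given in this record.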
Original language: English
Title of host publication: Advances in Machine Learning and Cybernetics - 4th International Conference, ICMLC 2005, Revised Selected Papers
Number of pages: 10
Publication status: Published - 14 Jul 2006
Event: 4th International Conference on Machine Learning and Cybernetics, ICMLC 2005 - Guangzhou, China
Duration: 18 Aug 2005 to 21 Aug 2005

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 3930 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 4th International Conference on Machine Learning and Cybernetics, ICMLC 2005

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

