TY - GEN
T1 - A multi-stage Chinese collocation extraction system
AU - Xu, Ruifeng
AU - Lu, Qin
PY - 2006/7/14
Y1 - 2006/7/14
AB - Most existing collocation extraction systems rely on globally significant statistical behaviors and lack mechanisms for handling different types of collocations. In this work, collocations are categorized into four types by considering compositionality, substitutability, modifiability and internal associations. Based on an analysis of each type, a multi-stage extraction system is designed that uses different combinations of discriminative features to identify different types of collocations at different stages. Perceptron training is employed to optimize the consolidation of discriminative features from different sources. Experimental results show that the achieved performance is much better than that of most reported work.
UR - http://www.scopus.com/inward/record.url?scp=33745781492&partnerID=8YFLogxK
M3 - Conference article published in proceeding or book
SN - 3540335846
SN - 9783540335849
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 740
EP - 749
BT - Advances in Machine Learning and Cybernetics - 4th International Conference, ICMLC 2005, Revised Selected Papers
T2 - 4th International Conference on Machine Learning and Cybernetics, ICMLC 2005
Y2 - 18 August 2005 through 21 August 2005
ER -