Abstract
A collocation is a recurrent, conventional natural language expression. In this research, Chinese collocations are categorized into four types. Based on a statistical analysis of typical collocations of each type, a multi-stage window-based collocation extraction system is designed, in which lexical statistics, synonym information, syntactic information, and dependency knowledge are used to extract n-gram collocations and the different types of bi-gram collocations separately. Experimental results show that this system achieves better precision and recall than existing statistical collocation extraction techniques.
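For readers unfamiliar with window-based extraction, the following is a minimal sketch of the general idea only; it is not the system described in the abstract. It assumes the text is already word-segmented (as Chinese text would need to be), counts ordered word pairs that co-occur within a fixed window, and ranks them by pointwise mutual information (PMI), a common lexical statistic chosen here purely for illustration. The function name `window_bigram_candidates` and the parameters `window` and `min_count` are hypothetical.

```python
import math
from collections import Counter


def window_bigram_candidates(sentences, window=5, min_count=2):
    """Rank ordered word pairs co-occurring within `window` tokens by PMI.

    `sentences` is an iterable of token lists (assumed pre-segmented).
    Returns a list of ((w1, w2), count, pmi) sorted by descending PMI.
    """
    word_counts = Counter()
    pair_counts = Counter()
    total_tokens = 0

    for tokens in sentences:
        word_counts.update(tokens)
        total_tokens += len(tokens)
        for i, head in enumerate(tokens):
            # Pair the head word with each token up to `window` positions to its right.
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                pair_counts[(head, tokens[j])] += 1

    total_pairs = sum(pair_counts.values())
    scored = []
    for (w1, w2), n in pair_counts.items():
        if n < min_count:
            continue  # drop rare pairs, whose PMI estimates are unreliable
        p_pair = n / total_pairs
        p_w1 = word_counts[w1] / total_tokens
        p_w2 = word_counts[w2] / total_tokens
        scored.append(((w1, w2), n, math.log2(p_pair / (p_w1 * p_w2))))

    scored.sort(key=lambda item: item[2], reverse=True)
    return scored
```

A statistical pass like this would only produce candidates; the synonym, syntactic, and dependency information mentioned in the abstract would be applied in later stages that this sketch does not attempt to reproduce.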
Original language | English |
---|---|
Title of host publication | 2005 International Conference on Machine Learning and Cybernetics, ICMLC 2005 |
Pages | 3254-3259 |
Number of pages | 6 |
Publication status | Published - 12 Dec 2005 |
Event | International Conference on Machine Learning and Cybernetics, ICMLC 2005 - Guangzhou, China. Duration: 18 Aug 2005 → 21 Aug 2005 |
Conference
Conference | International Conference on Machine Learning and Cybernetics, ICMLC 2005 |
---|---|
Country/Territory | China |
City | Guangzhou |
Period | 18/08/05 → 21/08/05 |
Keywords
- Collocation extraction
- Multi-stage extraction
- Natural language processing
ASJC Scopus subject areas
- General Engineering