Abstract
A collocation is a recurrent, conventional expression in natural language. In this research, Chinese collocations are categorized into four types. Based on a statistical analysis of typical collocations of each type, a multi-stage, window-based collocation extraction system is designed, in which lexical statistics, synonym information, syntactic information, and dependency knowledge are used to extract n-gram collocations and the different types of bi-gram collocations separately. Experimental results show that the system achieves better precision and recall than existing statistical collocation extraction techniques.
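As a rough illustration of what a window-based statistical stage might look like, the sketch below counts co-occurrences of word pairs within a fixed window over pre-segmented sentences and ranks them with a PMI-style association score. This is not the system described in the abstract: the function names, the window size, the frequency threshold, and the choice of pointwise mutual information are all assumptions made for the example, and the synonym, syntactic, and dependency stages are not modeled.

```python
import math
from collections import Counter


def window_cooccurrences(sentences, window=5):
    """Count word frequencies and ordered pair co-occurrences.

    `sentences` is an iterable of token lists (already word-segmented).
    A pair (w1, w2) is counted when w2 appears at most `window`
    tokens to the right of w1 in the same sentence.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, head in enumerate(tokens):
            upper = min(i + 1 + window, len(tokens))
            for j in range(i + 1, upper):
                pair_counts[(head, tokens[j])] += 1
    return word_counts, pair_counts


def pmi_candidates(sentences, window=5, min_count=2):
    """Rank bi-gram candidates by a PMI-style score (hypothetical stage).

    Pairs seen fewer than `min_count` times are discarded, since
    PMI is unreliable for rare events.
    """
    word_counts, pair_counts = window_cooccurrences(sentences, window)
    total = sum(word_counts.values())
    scored = []
    for (w1, w2), n in pair_counts.items():
        if n < min_count:
            continue
        # log2( p(w1, w2) / (p(w1) * p(w2)) ), with all probabilities
        # approximated by counts over the total number of tokens.
        score = math.log2(n * total / (word_counts[w1] * word_counts[w2]))
        scored.append(((w1, w2), score))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored
```

Calling `pmi_candidates(corpus, window=5, min_count=2)` on a list of segmented sentences returns candidate pairs sorted by score. In a multi-stage design such as the one the abstract outlines, a list like this would presumably be only the input to later filtering stages.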
| Original language | English |
|---|---|
| Title of host publication | 2005 International Conference on Machine Learning and Cybernetics, ICMLC 2005 |
| Pages | 3254-3259 |
| Number of pages | 6 |
| Publication status | Published - 12 Dec 2005 |
| Event | International Conference on Machine Learning and Cybernetics, ICMLC 2005 - Guangzhou, China; Duration: 18 Aug 2005 → 21 Aug 2005 |
Conference
| Conference | International Conference on Machine Learning and Cybernetics, ICMLC 2005 |
|---|---|
| Country/Territory | China |
| City | Guangzhou |
| Period | 18/08/05 → 21/08/05 |
Keywords
- Collocation extraction
- Multi-stage extraction
- Natural language processing
ASJC Scopus subject areas
- General Engineering
Fingerprint
Dive into the research topics of 'Multi-stage chinese collocation extraction'. Together they form a unique fingerprint.