Abstract
This paper presents an annotated Chinese collocation bank developed at the Hong Kong Polytechnic University. First, a definition of collocation with good linguistic consistency and good computational operability is discussed, and the properties of collocations are presented. Secondly, based on the combination of different properties, collocations are classified into four types. Thirdly, the annotation guideline is presented. Fourthly, the implementation issues for collocation bank construction are addressed, including the annotation with categorization, dependency and contextual information. Currently, the collocation bank has been completed for 3,643 headwords in a 5-million-word corpus.
Original language | English |
---|---|
Title of host publication | Proceedings of the 5th International Conference on Language Resources and Evaluation, LREC 2006 |
Publisher | European Language Resources Association (ELRA) |
Pages | 1880-1885 |
Number of pages | 6 |
Publication status | Published - 1 Jan 2006 |
Event | 5th International Conference on Language Resources and Evaluation, LREC 2006 - Genoa, Italy. Duration: 22 May 2006 → 28 May 2006 |
Conference
Conference | 5th International Conference on Language Resources and Evaluation, LREC 2006 |
---|---|
Country/Territory | Italy |
City | Genoa |
Period | 22/05/06 → 28/05/06 |
ASJC Scopus subject areas
- Education
- Library and Information Sciences
- Linguistics and Language
- Language and Linguistics