Abstract
This paper discusses the implementation of a knowledge-rich approach to the automatic acquisition of grammatical information. Our study is based on Word Sketch Engine (WSE; Kilgarriff and Tugwell 2002). The original claims of WSE are twofold: that linguistic generalizations can be automatically extracted from a corpus using simple collocation information, provided that the corpus is large enough; and that such a methodology is easily adaptable to a new language. Our work on Chinese Sketch Engine attests to the claim that WSE is adaptable to a new language. More critically, we show that the quality of the grammatical information provided has a direct bearing on the result of grammatical information acquisition. We show that when WSE is provided with a knowledge-rich lexical grammar, both the quantity and the quality of the extracted knowledge improve substantially over the results obtained with simple PS rules.
Original language | English |
---|---|
Title of host publication | PACLIC 20 - Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation |
Pages | 206-214 |
Number of pages | 9 |
Publication status | Published - 1 Dec 2006 |
Externally published | Yes |
Event | 20th Pacific Asia Conference on Language, Information and Computation, PACLIC 20 - Wuhan, China. Duration: 1 Nov 2006 → 3 Nov 2006 |
Conference
Conference | 20th Pacific Asia Conference on Language, Information and Computation, PACLIC 20 |
---|---|
Country/Territory | China |
City | Wuhan |
Period | 1/11/06 → 3/11/06 |
ASJC Scopus subject areas
- Language and Linguistics
- Computer Science (miscellaneous)