Abstract
This paper describes the construction of an English-Chinese parallel corpus of wine reviews and elaborates on one of its applications: an English-Chinese (E-C) bilingual oenology term bank of wine-tasting terms. The corpus is sourced from Decanter China and contains 1,211 aligned wine reviews in English and Chinese, totalling 149,463 Chinese characters and 66,909 English words. It serves as a dataset for investigating cross-lingual and cross-cultural differences in describing the sensory properties of wines. Our log-likelihood tests revealed good candidates for the Chinese translations of the English words in wine reviews. One of the most challenging features of this domain-specific bilingual term bank is the predominantly many-to-many nature of term mapping. We focused on one-to-many English-Chinese mapping relations of two major types: (a) words without a single precise translation (e.g. “palate”) and (b) words that are underspecified and involve ‘place-holder’ translation (e.g. “aroma”). Our study differs from previous bilingual CompuTerm studies by focusing on an area where cultural and sensory experiences favour many-to-many mappings instead of the default one-to-one mapping preferred in scientific and jurisprudential areas. This necessity for many-to-many mappings in turn challenges the basic design feature of many state-of-the-art automatic bilingual term-extraction approaches.
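The abstract does not specify how the log-likelihood tests were computed. As a minimal sketch, assuming a Dunning-style G² statistic over co-occurrence counts in aligned review pairs, candidate Chinese translations for an English word could be ranked as follows. The function names, the set-based tokenisation, and the four toy review pairs are all hypothetical illustrations and are not taken from the corpus.

```python
import math
from collections import Counter


def log_likelihood(k11, k12, k21, k22):
    """G2 statistic for a 2x2 contingency table of co-occurrence counts.

    k11: aligned pairs containing both the English word and the Chinese term
    k12: pairs with the English word only
    k21: pairs with the Chinese term only
    k22: pairs with neither
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing to the sum
                expected = rows[i] * cols[j] / n
                total += o * math.log(o / expected)
    return 2.0 * total


def rank_candidates(pairs, english_word):
    """Rank Chinese terms by their association with one English word.

    pairs: list of (english_token_set, chinese_token_set), one per aligned review.
    Several high-scoring terms may be returned, which suits one-to-many mappings.
    """
    n = len(pairs)
    en_count = sum(1 for en, _ in pairs if english_word in en)
    zh_total = Counter()
    zh_with_en = Counter()
    for en, zh in pairs:
        zh_total.update(zh)
        if english_word in en:
            zh_with_en.update(zh)
    scores = {}
    for term, k11 in zh_with_en.items():
        k12 = en_count - k11
        k21 = zh_total[term] - k11
        k22 = n - k11 - k12 - k21
        # Keep only terms that co-occur more often than chance would predict.
        if k11 * n > en_count * zh_total[term]:
            scores[term] = log_likelihood(k11, k12, k21, k22)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy aligned pairs, invented for illustration only.
toy_pairs = [
    ({"fresh", "palate"}, {"清新", "口感"}),
    ({"rich", "palate"}, {"浓郁", "味蕾"}),
    ({"floral", "aroma"}, {"花香", "香气"}),
    ({"ripe", "aroma"}, {"成熟", "香气"}),
]
ranked = rank_candidates(toy_pairs, "aroma")
```

Returning a ranked list rather than a single best match is a deliberate choice here: it leaves room for the one-to-many mappings that the abstract identifies as characteristic of this domain.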
Original language | English |
---|---|
Title of host publication | Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation |
Editors | Minh Le Nguyen, Mai Chi Luong, Sanghoun Song |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 318–328 |
Publication status | Published - Oct 2020 |
Event | The 34th Pacific Asia Conference on Language, Information and Computation (PACLIC-34) - Vietnam National University, Hanoi, Viet Nam. Duration: 24 Oct 2020 → 26 Oct 2020 |
Conference
Conference | The 34th Pacific Asia Conference on Language, Information and Computation (PACLIC-34) |
---|---|
Country/Territory | Viet Nam |
City | Hanoi |
Period | 24/10/20 → 26/10/20 |