TY - GEN
T1 - Building a Semantic Transparency Dataset of Chinese Nominal Compounds
T2 - 2014 Workshop on Lexical and Grammatical Resources for Language Processing, LG-LP 2014
AU - Wang, Shichang
AU - Huang, Chu-Ren
AU - Yao, Yao
AU - Chan, Angel
N1 - Funding Information:
The work described in this paper was supported by grants from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. 544011 & 543512).
Publisher Copyright:
© COLING 2014. All rights reserved.
PY - 2014
Y1 - 2014
N2 - This paper describes the creation of a semantic transparency dataset of Chinese nominal compounds (SemTransCNC 1.0) using crowdsourcing. We first selected about 1,200 Chinese nominal compounds from a lexicon of modern Chinese and the Sinica Corpus. Then, through a series of crowdsourcing experiments conducted on the Crowdflower platform, we collected both overall semantic transparency and constituent semantic transparency data for each compound. According to our evaluation, the data quality is good. This work fills a gap in Chinese language resources and also explores the use of crowdsourcing for linguistic experiments and language resource construction.
UR - http://www.scopus.com/inward/record.url?scp=84967219589&partnerID=8YFLogxK
M3 - Conference article published in proceeding or book
AN - SCOPUS:84967219589
T3 - Proceedings of the Workshop on Lexical and Grammatical Resources for Language Processing, LG-LP 2014 - in conjunction with 25th International Conference on Computational Linguistics, COLING 2014
SP - 147
EP - 156
BT - Proceedings of the Workshop on Lexical and Grammatical Resources for Language Processing, LG-LP 2014 - in conjunction with 25th International Conference on Computational Linguistics, COLING 2014
A2 - Baptista, Jorge
A2 - Bhattacharyya, Pushpak
A2 - Fellbaum, Christiane
A2 - Forcada, Mikel
A2 - Huang, Chu-Ren
A2 - Koeva, Svetla
A2 - Krstev, Cvetana
A2 - Laporte, Eric
PB - Association for Computational Linguistics (ACL)
Y2 - 24 August 2014
ER -