Building a Semantic Transparency Dataset of Chinese Nominal Compounds: A Practice of Crowdsourcing Methodology

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

6 Citations (Scopus)

Abstract

This paper describes work to create a semantic transparency dataset of Chinese nominal compounds (SemTransCNC 1.0) using crowdsourcing. We first selected about 1,200 Chinese nominal compounds from a lexicon of modern Chinese and the Sinica Corpus. Then, through a series of crowdsourcing experiments conducted on the Crowdflower platform, we collected both overall semantic transparency and constituent semantic transparency data for each compound. According to our evaluation, the data quality is good. This work fills a gap in Chinese language resources and also puts into practice and explores crowdsourcing as a methodology for linguistic experiments and language resource construction.

Original language: English
Title of host publication: Proceedings of the Workshop on Lexical and Grammatical Resources for Language Processing, LG-LP 2014 - in conjunction with 25th International Conference on Computational Linguistics, COLING 2014
Editors: Jorge Baptista, Pushpak Bhattacharyya, Christiane Fellbaum, Mikel Forcada, Chu-Ren Huang, Svetla Koeva, Cvetana Krstev, Eric Laporte
Publisher: Association for Computational Linguistics (ACL)
Pages: 147-156
Number of pages: 10
ISBN (Electronic): 9781873769447
Publication status: Published - 2014
Event: 2014 Workshop on Lexical and Grammatical Resources for Language Processing, LG-LP 2014 - Dublin, Ireland
Duration: 24 Aug 2014 → …

Publication series

Name: Proceedings of the Workshop on Lexical and Grammatical Resources for Language Processing, LG-LP 2014 - in conjunction with 25th International Conference on Computational Linguistics, COLING 2014

Conference

Conference: 2014 Workshop on Lexical and Grammatical Resources for Language Processing, LG-LP 2014
Country/Territory: Ireland
City: Dublin
Period: 24/08/14 → …

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Information Systems
