Abstract
Cross-lingual sentiment classification aims to perform sentiment classification in one language (the target language) with the help of resources from another language (the source language). Previous studies tend to use all available data in the source language, although using all of the data has been observed to perform no better, or even worse, than using a portion of good data. In this paper, we propose a novel task, data quality controlling in the source language, which selects high-quality samples from the source language. To tackle this task, we propose two kinds of data quality measurements, intra- and extra-quality measurements, which are implemented with certainty and similarity measurements, respectively. Empirical studies demonstrate the effectiveness of the proposed approach to data quality controlling in the source language.
Original language | English |
---|---|
Title of host publication | Proceedings - 2013 International Conference on Asian Language Processing, IALP 2013 |
Pages | 125-128 |
Number of pages | 4 |
Publication status | Published - 1 Dec 2013 |
Event | 2013 International Conference on Asian Language Processing, IALP 2013 - Urumqi, Xinjiang, China. Duration: 17 Aug 2013 → 19 Aug 2013 |
Conference
Conference | 2013 International Conference on Asian Language Processing, IALP 2013 |
---|---|
Country/Territory | China |
City | Urumqi, Xinjiang |
Period | 17/08/13 → 19/08/13 |
ASJC Scopus subject areas
- Software