Abstract
Automatic Chinese irony detection is a challenging task with a strong impact on linguistic research. However, Chinese irony detection often lacks labeled benchmark datasets. In this paper, we introduce Ciron, the first Chinese benchmark dataset available for irony detection with machine learning models. Ciron includes more than 8.7K posts collected from Weibo, a microblogging platform. Most importantly, Ciron is collected with no preconditions, ensuring much wider coverage. Evaluation on seven different machine learning classifiers demonstrates the usefulness of Ciron as an important resource for Chinese irony detection.
Original language | English |
---|---|
Title of host publication | Proceedings of the Twelfth Language Resources and Evaluation Conference |
Editors | Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis |
Publisher | European Language Resources Association (ELRA) |
Pages | 5714-5720 |
Publication status | Published - May 2020 |
Event | 12th International Conference on Language Resources and Evaluation, LREC 2020 - Marseille, France. Duration: 11 May 2020 → 16 May 2020 |
Conference
Conference | 12th International Conference on Language Resources and Evaluation, LREC 2020 |
---|---|
Country/Territory | France |
City | Marseille |
Period | 11/05/20 → 16/05/20 |