Abstract
Distributed word representations are very useful for capturing semantic information and have been successfully applied to a variety of NLP tasks, especially in English. In this work, we develop two component-enhanced Chinese character embedding models and their bigram extensions. Unlike English word embeddings, our models explore the compositions of Chinese characters, which often inherently serve as semantic indicators. Evaluations on both word similarity and text classification demonstrate the effectiveness of our models.
Original language | English |
---|---|
Title of host publication | Conference Proceedings - EMNLP 2015 |
Subtitle of host publication | Conference on Empirical Methods in Natural Language Processing |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 829-834 |
Number of pages | 6 |
ISBN (Electronic) | 9781941643327 |
Publication status | Published - 1 Jan 2015 |
Event | Conference on Empirical Methods in Natural Language Processing, EMNLP 2015 - Lisbon, Portugal. Duration: 17 Sept 2015 → 21 Sept 2015 |
Conference
Conference | Conference on Empirical Methods in Natural Language Processing, EMNLP 2015 |
---|---|
Country/Territory | Portugal |
City | Lisbon |
Period | 17/09/15 → 21/09/15 |
ASJC Scopus subject areas
- Computational Theory and Mathematics
- Computer Science Applications
- Information Systems