Component-enhanced Chinese character embeddings

Yanran Li, Wenjie Li, Fei Sun, Sujian Li

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

53 Citations (Scopus)


Distributed word representations are very useful for capturing semantic information and have been successfully applied in a variety of NLP tasks, especially for English. In this work, we develop two component-enhanced Chinese character embedding models and their bigram extensions. Unlike English word embeddings, our models exploit the components of Chinese characters, which often serve as inherent semantic indicators. Evaluations on both word similarity and text classification demonstrate the effectiveness of our models.
Original language: English
Title of host publication: Conference Proceedings - EMNLP 2015
Subtitle of host publication: Conference on Empirical Methods in Natural Language Processing
Publisher: Association for Computational Linguistics (ACL)
Number of pages: 6
ISBN (Electronic): 9781941643327
Publication status: Published - 1 Jan 2015
Event: Conference on Empirical Methods in Natural Language Processing, EMNLP 2015 - Lisbon, Portugal
Duration: 17 Sept 2015 to 21 Sept 2015


Conference: Conference on Empirical Methods in Natural Language Processing, EMNLP 2015

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems

