Exploiting word internal structures for generic Chinese sentence representation

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-reviewed

Abstract

We introduce a novel mixed character-word architecture that improves Chinese sentence representations by exploiting the rich semantic information in word internal structures. Our architecture uses two key strategies. The first is a mask gate on characters, which learns the relations among the characters within a word. The second is a max-pooling operation on words, which adaptively finds the optimal mixture of the atomic and compositional word representations. Finally, the proposed architecture is applied to various sentence composition models and achieves substantial performance gains over baseline models on a sentence similarity task.
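The two strategies in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the dimensions, the sigmoid gate parameterization, and the random stand-in weights are all assumptions made for illustration; only the overall shape (a gate over character embeddings, then an elementwise max between the atomic and compositional word vectors) follows the description above.

```python
import numpy as np

# Hypothetical setup: a 2-character word with embedding size 4.
d = 4
rng = np.random.default_rng(0)
char_embs = rng.standard_normal((2, d))  # character embeddings of the word
atomic = rng.standard_normal(d)          # atomic (whole-word) embedding

# Mask gate on characters: a sigmoid gate weighting each character's
# contribution (W is a random stand-in for a trained parameter matrix).
W = rng.standard_normal((d, d))
gate = 1.0 / (1.0 + np.exp(-(char_embs @ W)))       # elementwise sigmoid
compositional = (gate * char_embs).sum(axis=0)      # gated character mixture

# Max-pooling mixture: take the elementwise max of the atomic and
# compositional views, adaptively keeping the stronger signal per dimension.
mixed = np.maximum(atomic, compositional)
print(mixed.shape)  # (4,)
```

By construction, each dimension of `mixed` is at least as large as the corresponding dimension of either input view, which is what lets the model fall back on whichever representation carries more signal.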

Original language: English
Title of host publication: EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings
Editors: Martha Palmer, Rebecca Hwa, Sebastian Riedel
Publisher: Association for Computational Linguistics (ACL)
Pages: 298-303
Number of pages: 6
ISBN (Electronic): 9781945626838
DOIs
Publication status: Published - Sept 2017
Externally published: Yes
Event: 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017 - Copenhagen, Denmark
Duration: 9 Sept 2017 - 11 Sept 2017

Publication series

Name: EMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings

Conference

Conference: 2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017
Country/Territory: Denmark
City: Copenhagen
Period: 9/09/17 - 11/09/17

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Computational Theory and Mathematics
