Combining Contextual Information by Self-attention Mechanism in Convolutional Neural Networks for Text Classification

Xin Wu, Yi Cai, Qing Li, Jingyun Xu, Ho fung Leung

Research output: Chapter in book / Conference proceeding › Conference article published in proceeding or book › Academic research › peer-review

5 Citations (Scopus)

Abstract

Convolutional neural networks (CNNs) are widely used in many NLP tasks, where their convolutional filters capture useful semantic features of texts. However, convolutional filters with a small window size may lose the global context information of a text, while simply increasing the window size leads to data sparsity and a very large number of parameters. To capture global context information, we propose using the self-attention mechanism to obtain contextual word embeddings. We present two methods for combining word and contextual embeddings, and then apply convolutional neural networks to capture semantic features. Experimental results on five commonly used datasets show the effectiveness of the proposed methods.
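The pipeline outlined in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say what the two combination methods are, so concatenation and element-wise addition are used here purely as assumed stand-ins. The attention step likewise assumes plain scaled dot-product attention without learned projections, and all names (self_attention, conv_max_pool), sizes, and random weights are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(words):
    # ASSUMPTION: scaled dot-product self-attention with no learned projections.
    # words: (n_words, dim). Each output row is a weighted average of all word
    # vectors, so it carries information from the whole sentence.
    dim = words.shape[-1]
    scores = words @ words.T / np.sqrt(dim)
    weights = softmax(scores, axis=-1)
    return weights @ words, weights

def conv_max_pool(x, filters):
    # x: (n_words, dim); filters: (n_filters, window, dim).
    # Slides each filter over consecutive word windows, applies ReLU,
    # then keeps the maximum response per filter (max-over-time pooling).
    _, window, _ = filters.shape
    n_words = x.shape[0]
    windows = np.stack([x[i:i + window] for i in range(n_words - window + 1)])
    feature_maps = np.einsum("whd,khd->wk", windows, filters)
    return np.maximum(feature_maps, 0.0).max(axis=0)

rng = np.random.default_rng(0)
n_words, dim, n_filters, window = 7, 8, 4, 3

# Random vectors stand in for pretrained word embeddings of one sentence.
words = rng.normal(size=(n_words, dim))
context, weights = self_attention(words)

# ASSUMPTION: two illustrative ways to combine word and contextual embeddings.
combined_concat = np.concatenate([words, context], axis=-1)  # (n_words, 2 * dim)
combined_sum = words + context                               # (n_words, dim)

# A small-window CNN over each combined representation.
features_concat = conv_max_pool(
    combined_concat, rng.normal(size=(n_filters, window, 2 * dim)))
features_sum = conv_max_pool(
    combined_sum, rng.normal(size=(n_filters, window, dim)))
```

In this sketch the filter window stays small (three words), yet each input row already mixes in information from the whole sentence through the attention weights, which is the motivation the abstract gives for contextual embeddings. The pooled feature vectors would normally feed a classifier layer, which is omitted here.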

Original language: English
Title of host publication: Web Information Systems Engineering – WISE 2018 - 19th International Conference, 2018, Proceedings
Editors: Hye-Young Paik, Hua Wang, Rui Zhou, Hakim Hacid, Wojciech Cellary
Publisher: Springer-Verlag
Pages: 453-467
Number of pages: 15
ISBN (Print): 9783030029210
Publication status: Published - 1 Jan 2018
Externally published: Yes
Event: 19th International Conference on Web Information Systems Engineering, WISE 2018 - Dubai, United Arab Emirates
Duration: 12 Nov 2018 – 15 Nov 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11233 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 19th International Conference on Web Information Systems Engineering, WISE 2018
Country/Territory: United Arab Emirates
City: Dubai
Period: 12/11/18 – 15/11/18

Keywords

  • Attention mechanism
  • Convolutional neural networks
  • Text classification
  • Word representation

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
