Leveraging writing systems changes for deep learning based Chinese affective analysis

Rong Xiang, Qin Lu, Ying Jiao, Yufei Zheng, Wenhao Ying, Yunfei Long

Research output: Journal article publication > Journal article > Academic research > peer-review

Abstract

Affective analysis of social media text is in great demand. Online text written in Chinese communities often contains mixed scripts: major text written in Chinese, an ideograph-based writing system, and minor text written in Latin letters, an alphabet-based writing system. This phenomenon is referred to as writing systems changes (WSCs). Past studies have shown that WSCs often reflect unfiltered, immediate emotions. However, WSCs pose additional challenges for Natural Language Processing tasks because they can break the syntax of the major text. In this work, we use WSCs as an effective feature in a hybrid deep learning model with an attention network. WSC segments are first identified by their encoding range. Then, the document representation of the text is learned through a Long Short-Term Memory model, and the minor text is learned by a separate Convolutional Neural Network model. To further highlight the WSC components, an attention mechanism is adopted to re-weight the feature vector before the classification layer. Experiments show that the proposed hybrid deep learning method, which better incorporates WSC features, improves performance over state-of-the-art classification models. The experimental results indicate that WSCs can serve as effective information in affective analysis of social media text.
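
The abstract states that WSC segments are identified by their encoding range but does not give the ranges used. The following is a minimal sketch of that step under stated assumptions: minor text is approximated as runs of ASCII Latin letters, and major text as characters in the CJK Unified Ideographs block (U+4E00 to U+9FFF). The names `split_scripts` and `has_wsc` are hypothetical and are not taken from the paper.

```python
import re

# Assumption: minor (Latin) text is a run of ASCII letters.
LATIN_RUN = re.compile(r"[A-Za-z]+")
# Assumption: major (Chinese) text is any character in the
# CJK Unified Ideographs block, U+4E00..U+9FFF.
CJK_CHAR = re.compile(r"[\u4e00-\u9fff]")


def split_scripts(text):
    """Split a mixed-script string into (major_text, minor_tokens).

    major_text is the Chinese characters in their original order;
    minor_tokens is the list of Latin-letter runs.
    """
    minor_tokens = LATIN_RUN.findall(text)
    major_text = "".join(CJK_CHAR.findall(text))
    return major_text, minor_tokens


def has_wsc(text):
    """Return True when the text mixes Chinese and Latin-letter segments."""
    return bool(CJK_CHAR.search(text)) and bool(LATIN_RUN.search(text))
```

In a pipeline like the one described, the two outputs could then be routed to separate encoders (the major text to the sequence model, the minor tokens to the convolutional model). That routing is only suggested here and is not part of the sketch.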

Original language: English
Pages (from-to): 3313-3325
Number of pages: 13
Journal: International Journal of Machine Learning and Cybernetics
Volume: 10
Issue number: 11
DOIs
Publication status: Published - 1 Nov 2019

Keywords

  • Affective analysis
  • Deep learning network
  • Writing system changes

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence
