Abstract
Affective lexicons are among the most important resources in affective computing for text. Manually constructed affective lexicons are limited in scale and are therefore of limited use in practical systems. In this work, we propose a regression-based method that automatically infers multi-dimensional affective representations of words from their word embeddings, using a set of seed words. The method exploits the rich semantic information captured by word embeddings to extract meaning in a specific semantic space. It rests on the assumption that different features of a word embedding contribute differently to a particular affective dimension, and that a particular feature contributes differently to different affective dimensions. Evaluation on various affective lexicons shows that our method outperforms state-of-the-art methods by large margins on all the lexicons and under different evaluation metrics. We also explore different regression models and conclude that Ridge regression, Bayesian Ridge regression and Support Vector Regression with a linear kernel are the most suitable models. Compared with other state-of-the-art methods, our method also has a computational advantage. Experiments on a sentiment analysis task show that the lexicons extended by our method achieve better results than publicly available sentiment lexicons on eight sentiment corpora. The extended lexicons are publicly available.
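The core idea of the abstract, regressing an affective score on word-embedding features learned from seed words, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three-dimensional "embeddings", the seed valence scores, the penalty `alpha`, and the helper names `solve`, `fit_ridge` and `predict` are hypothetical and are not taken from the paper. The sketch uses closed-form ridge regression for a single affective dimension; a multi-dimensional lexicon would presumably fit one such regressor per dimension.

```python
# Toy illustration only: the embeddings and seed scores below are made up.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ridge(X, y, alpha=0.1):
    """Closed-form ridge regression, w = (X^T X + alpha I)^-1 X^T y, with an unpenalised bias."""
    rows = [x + [1.0] for x in X]  # constant feature acts as the intercept
    d = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    for i in range(d - 1):  # add the penalty to every weight except the bias
        gram[i][i] += alpha
    rhs = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(d)]
    return solve(gram, rhs)

def predict(w, x):
    """Score a word from its embedding using the learned weights."""
    return sum(wi * xi for wi, xi in zip(w, x + [1.0]))

# Hypothetical 3-dimensional embeddings (real ones have hundreds of dimensions).
embeddings = {
    "good":  [0.90, 0.10, 0.30],
    "happy": [0.80, 0.20, 0.50],
    "bad":   [-0.70, 0.10, 0.40],
    "sad":   [-0.80, 0.30, 0.20],
    "great": [0.85, 0.15, 0.40],
    "awful": [-0.75, 0.20, 0.30],
}

# Hypothetical seed lexicon: valence scores in [-1, 1] for a few words.
seed_valence = {"good": 0.8, "happy": 0.9, "bad": -0.7, "sad": -0.8}

seeds = list(seed_valence)
w = fit_ridge([embeddings[s] for s in seeds], [seed_valence[s] for s in seeds])

# Extend the lexicon to words that have an embedding but no seed score.
for word in ("great", "awful"):
    print(word, round(predict(w, embeddings[word]), 3))
```

In this toy setting the learned weights differ across embedding features, which mirrors the stated assumption that features contribute unequally to a given affective dimension. In practice a library implementation of Ridge, Bayesian Ridge or linear-kernel Support Vector Regression would replace the hand-written solver.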
Original language | English |
---|---|
Article number | 7968355 |
Pages (from-to) | 443-456 |
Number of pages | 14 |
Journal | IEEE Transactions on Affective Computing |
Volume | 8 |
Issue number | 4 |
DOIs | |
Publication status | Published - 1 Oct 2017 |
Keywords
- Affective lexicon
- emotion
- regression
- sentiment
- word embedding
ASJC Scopus subject areas
- Software
- Human-Computer Interaction